

com.aliasi.classify
Class ScoredPrecisionRecallEvaluation

java.lang.Object
  com.aliasi.classify.ScoredPrecisionRecallEvaluation

public class ScoredPrecisionRecallEvaluation
A ScoredPrecisionRecallEvaluation provides an evaluation of possible precision-recall operating points and other summary statistics. The single method addCase(boolean,double) is used to populate the evaluation, with the first argument representing whether the response was correct and the second the score that was assigned. This evaluation does not consider negative reference cases. If a positive reference case is not scored by a classifier being evaluated, the method addMisses(int) should be used to increment the number of such positive reference cases not scored. See below for more information.
By way of example, consider the following table of cases, all of which involve positive responses. The cases are in rank order, with their scores and whether they were correct listed. For this example, we assume each positive reference result is scored by the classifier.

Rank  Score  Correct    Rec   Prec  Rej Rec
   0   1.21  incorrect  0.00  NaN   0.83
   1   1.27  correct    0.25  0.50  0.83
   2   1.39  incorrect  0.25  0.33  0.67
   3   1.47  correct    0.50  0.50  0.67
   4   1.60  correct    0.75  0.60  0.67
   5   1.65  incorrect  0.75  0.50  0.50
   6   1.79  incorrect  0.75  0.43  0.33
   7   1.80  incorrect  0.75  0.38  0.17
   8   2.01  correct    1.00  0.44  0.17
   9   3.70  incorrect  1.00  0.40  0.00

Note that there are four positive reference cases (the rows marked correct) and six negative reference cases (the rows marked incorrect) in this table. By setting an acceptance threshold at the various scores, the precision, recall, and rejection recall values listed in the fourth through sixth columns are derived. For instance, after the rank 0 response, which is wrong, recall is 0/4 = 0.00, because we have retrieved none of the four positive reference cases; similarly, rejection recall is 5/6 = 0.83, because 5/6 of the negative reference cases have been rejected. After the rank 4 response, recall is 3/4 = 0.75 and rejection recall is 4/6 = 0.67.
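The derivation of these columns can be sketched in plain Java. This is illustrative code, not LingPipe source; the class and method names are invented.

```java
// Illustrative sketch, not LingPipe source; class and method names
// are invented. Recomputes the Rec/Prec/RejRec columns of the table
// from the ranked correctness flags.
public class TableColumns {

    // correctness flags for ranks 0..9, as in the example table
    static final boolean[] CORRECT = {
        false, true, false, true, true, false, false, false, true, false
    };

    // precision after accepting the top (rank+1) responses; note the
    // table above reports NaN at rank 0, where this sketch yields 0.00
    static double precisionAt(boolean[] correct, int rank) {
        int tp = 0;
        for (int i = 0; i <= rank; ++i)
            if (correct[i]) ++tp;
        return tp / (double) (rank + 1);
    }

    // recall = TP / (total positive reference cases)
    static double recallAt(boolean[] correct, int rank, int numPositive) {
        int tp = 0;
        for (int i = 0; i <= rank; ++i)
            if (correct[i]) ++tp;
        return tp / (double) numPositive;
    }

    // rejection recall = TN / (total negative reference cases); any
    // incorrect response above the threshold is a false positive
    static double rejectionRecallAt(boolean[] correct, int rank, int numNegative) {
        int fp = 0;
        for (int i = 0; i <= rank; ++i)
            if (!correct[i]) ++fp;
        return (numNegative - fp) / (double) numNegative;
    }

    public static void main(String[] args) {
        for (int rank = 0; rank < CORRECT.length; ++rank)
            System.out.printf("%4d  %.2f  %.2f  %.2f%n",
                              rank,
                              recallAt(CORRECT, rank, 4),
                              precisionAt(CORRECT, rank),
                              rejectionRecallAt(CORRECT, rank, 6));
    }
}
```

Running main prints one row per rank; for example, rank 4 yields recall 0.75, precision 0.60, and rejection recall 0.67, matching the table.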
The pairs of precision/recall values form the basis for the precision-recall curve returned by prCurve(boolean), with the argument indicating whether to perform precision interpolation. For the above table:

prCurve(false) = { {0.25, 0.50}, {0.50, 0.50}, {0.75, 0.60}, {1.00, 0.44} }
The pairs of recall/rejection recall values form the basis for the receiver operating characteristic (ROC) curve returned by rocCurve(boolean), with the boolean parameter again indicating whether to perform interpolation. For the above table, the result is:

rocCurve(false) = { { 0.25, 0.83 }, { 0.50, 0.67 }, { 0.75, 0.67 }, { 1.00, 0.17 } }

Note that for both curves, only the rows corresponding to correct responses are considered.
Precision interpolation removes any operating point for which there is a dominant operating point in both dimensions. For the precision-recall curve, the points (0.25,0.50) and (0.50,0.50) are dominated in both dimensions by (0.75,0.60) and so are dropped; the resulting curve is:

prCurve(true) = { {0.75, 0.60}, {1.00, 0.44} }

This is meant to be read as having a constant precision of 0.60 for all recall values between 0 and 0.75 inclusive; thus the interpolation increases values. For the ROC curve, only three points remain:

rocCurve(true) = { { 0.25, 0.83 }, { 0.75, 0.67 }, { 1.00, 0.17 } }

Note that the precision-interpolated curves always provide strictly decreasing precisions.
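The pruning step can be sketched as follows (illustrative code, not LingPipe source; the class name is invented). A point is dropped when some other point has strictly higher recall and at-least-equal precision, which reproduces both pruned curves above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, not LingPipe source. Points are {recall, precision}
// (or {recall, rejection recall}) pairs; a point is dominated if another
// point has strictly higher recall and at-least-equal precision.
public class InterpolatePrune {

    static double[][] prune(double[][] curve) {
        List<double[]> kept = new ArrayList<double[]>();
        for (double[] p : curve) {
            boolean dominated = false;
            for (double[] q : curve) {
                if (q[0] > p[0] && q[1] >= p[1]) {
                    dominated = true;
                    break;
                }
            }
            if (!dominated)
                kept.add(p);
        }
        return kept.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        double[][] pr = { {0.25, 0.50}, {0.50, 0.50}, {0.75, 0.60}, {1.00, 0.44} };
        for (double[] p : prune(pr))
            System.out.printf("{%.2f, %.2f}%n", p[0], p[1]);
        // retains {0.75, 0.60} and {1.00, 0.44}, matching prCurve(true)
    }
}
```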
The area under the raw precision-recall and ROC curves, with or without interpolation, is computed by areaUnderPrCurve(boolean) and areaUnderRocCurve(boolean):

areaUnderPrCurve(false) = (0.25 - 0.00) * 0.50 + (0.50 - 0.25) * 0.50 + (0.75 - 0.50) * 0.60 + (1.00 - 0.75) * 0.44 = 0.51
areaUnderPrCurve(true)  = (0.75 - 0.00) * 0.60 + (1.00 - 0.75) * 0.44 = 0.56

The ROC areas are computed similarly to yield:

areaUnderRocCurve(false) = 0.58
areaUnderRocCurve(true)  = 0.58

Note that the precision-interpolated values are always at least as high.
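The rectangle rule behind these numbers can be sketched as follows (illustrative code, not LingPipe source; the class name is invented).

```java
// Illustrative sketch, not LingPipe source. Each {recall, precision}
// point contributes a rectangle of width (recall - previous recall)
// and height precision.
public class CurveArea {

    static double area(double[][] curve) {
        double total = 0.0;
        double prevRecall = 0.0;
        for (double[] p : curve) {
            total += (p[0] - prevRecall) * p[1];
            prevRecall = p[0];
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] raw    = { {0.25, 0.50}, {0.50, 0.50}, {0.75, 0.60}, {1.00, 0.44} };
        double[][] interp = { {0.75, 0.60}, {1.00, 0.44} };
        System.out.printf("%.2f%n", area(raw));    // 0.51
        System.out.printf("%.2f%n", area(interp)); // 0.56
    }
}
```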
For precision-recall curves, three additional summary statistics are available. The first provides an average of the precision values over the operating points on the uninterpolated precision-recall curve:

averagePrecision() = (0.50 + 0.50 + 0.60 + 0.44)/4.00 = 0.51

The second merely returns the maximum Fβ measure for an actual operating point:

maximumFMeasure() = maximumFMeasure(1) = 0.67

Note that this statistic provides a post-hoc optimal setting for F measure. Further note that it is based on actual operating points, not interpolations between operating points. The final statistic is the so-called precision-recall breakeven point (BEP). This is computed in the standard way using the interpolated precision-recall curve. Because the two points of interest are (0.00,0.60) and (0.75,0.60), the point at which precision and recall are equal is (0.60,0.60), and thus:

prBreakevenPoint() = 0.6

Note that this value will always be less than or equal to the maximum F1 measure.
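The first two statistics can be sketched over the uninterpolated operating points (illustrative code, not LingPipe source; names invented).

```java
// Illustrative sketch, not LingPipe source. Summary statistics over
// {recall, precision} operating points from the worked example.
public class PrStats {

    static final double[][] PR =
        { {0.25, 0.50}, {0.50, 0.50}, {0.75, 0.60}, {1.00, 0.44} };

    // pointwise average of the precision values
    static double averagePrecision(double[][] curve) {
        double sum = 0.0;
        for (double[] p : curve)
            sum += p[1];
        return sum / curve.length;
    }

    // maximum F1 = 2*r*p/(r+p) over the actual operating points
    static double maximumFMeasure(double[][] curve) {
        double max = Double.NEGATIVE_INFINITY;
        for (double[] p : curve)
            max = Math.max(max, 2.0 * p[0] * p[1] / (p[0] + p[1]));
        return max;
    }

    public static void main(String[] args) {
        System.out.printf("%.2f%n", averagePrecision(PR)); // 0.51
        System.out.printf("%.2f%n", maximumFMeasure(PR));  // 0.67
    }
}
```

The maximum F1 here is attained at (0.75, 0.60), as in the worked example.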
Given the precision-recall curve, it is possible to compute the precision after any given number of results. The method precisionAt(int) returns the precision after the specified number of results. For instance, in the above table:

precisionAt(5) = 0.6
precisionAt(10) = 0.4

Typically results are reported at 5, 10 and 100 when available (counting from 1, not 0). The other information-retrieval style result returned is the reciprocal rank (RR), which is defined to be 1/rank, again counting from 1, not 0. For instance, for the table above, the first correct answer is the second, so RR is 1/2:

reciprocalRank() = 0.5
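Both statistics can be sketched from the ranked correctness flags (illustrative code, not LingPipe source; names invented).

```java
// Illustrative sketch, not LingPipe source. Precision-at-rank and
// reciprocal rank from the ranked correctness flags of the example.
public class IrStats {

    static final boolean[] CORRECT = {
        false, true, false, true, true, false, false, false, true, false
    };

    // precision over the top n results, counting from 1
    static double precisionAt(boolean[] correct, int n) {
        int tp = 0;
        for (int i = 0; i < n; ++i)
            if (correct[i]) ++tp;
        return tp / (double) n;
    }

    // 1/rank of the first correct result, counting from 1
    static double reciprocalRank(boolean[] correct) {
        for (int i = 0; i < correct.length; ++i)
            if (correct[i]) return 1.0 / (i + 1);
        return 0.0; // no correct result found
    }

    public static void main(String[] args) {
        System.out.println(precisionAt(CORRECT, 5)); // 0.6
        System.out.println(reciprocalRank(CORRECT)); // 0.5
    }
}
```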
The method addMisses(int) provides a mechanism to add counts for items that were missed by the system being evaluated. Any positive reference item that is not found by the system, and hence never added through addCase(boolean,double), should result in a call to addMisses(1). These calls may be aggregated; for instance, five missed items may be recorded with a single call to addMisses(5). Missing cases will not arise from LingPipe's own classifiers, which always return scores for all results.
Constructor Summary  

ScoredPrecisionRecallEvaluation()
Construct a scored precision-recall evaluation. 
Method Summary  

void 
addCase(boolean correct,
double score)
Add a case with the specified correctness and response score. 
void 
addMisses(int count)
Increments the positive reference count without adding a return case from the classifier. 
double 
areaUnderPrCurve(boolean interpolate)
Returns the area under the recall-precision curve with interpolation as specified. 
double 
areaUnderRocCurve(boolean interpolate)
Returns the area under the receiver operating characteristic (ROC) curve. 
double 
averagePrecision()
Returns the pointwise average precision of points on the uninterpolated precision-recall curve. 
double 
maximumFMeasure()
Returns the maximum F1 measure for an actual operating point on the uninterpolated precision-recall curve. 
double 
maximumFMeasure(double beta)
Returns the maximum Fβ measure for an actual operating point on the uninterpolated precision-recall curve for a specified β. 
int 
numCases()
Returns the number of cases that have been added to this evaluation. 
double 
prBreakevenPoint()
Returns the breakeven point (BEP) for precision and recall based on the interpolated precision. 
double[][] 
prCurve(boolean interpolate)
Returns the set of recall/precision operating points according to the scores of the cases. 
double 
precisionAt(int rank)
Returns the precision score achieved by returning the top scoring documents up to the specified rank. 
static void 
printPrecisionRecallCurve(double[][] prCurve,
PrintWriter pw)
Prints a precision-recall curve with F measures. 
double[][] 
prScoreCurve(boolean interpolate)
Returns the set of recall/precision/score operating points according to the scores of the cases. 
double 
reciprocalRank()
Returns the reciprocal rank (RR) for this evaluation. 
double[][] 
rocCurve(boolean interpolate)
Returns the receiver operating characteristic (ROC) curve for the cases ordered by score. 
String 
toString()
Returns a string-based representation of this scored precision-recall evaluation. 
Methods inherited from class java.lang.Object 

clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait 
Constructor Detail 

public ScoredPrecisionRecallEvaluation()
Method Detail 

public void addCase(boolean correct, double score)

Adds a case with the specified correctness and response score. The flag should be true if the reference was also positive. The score is just the response score.

Warning: The scores should be sensibly comparable across cases.

Parameters:
correct - true if this case was correct.
score - Score of response.

public void addMisses(int count)

Increments the positive reference count without adding a return case from the classifier.

Parameters:
count - Number of outright misses to add to this evaluation.

public int numCases()

Returns the number of cases that have been added to this evaluation.
public double[][] prCurve(boolean interpolate)

Returns the set of recall/precision operating points according to the scores of the cases. In contrast, rocCurve(boolean) returns the recall (sensitivity) versus rejection recall (specificity) operating points, which take the number of true negative classifications into account. Note that the recall values (the first component) are strictly increasing, resulting in a well-defined function from recall to precision.

The second operation derives so-called "interpolated precision" and is widely used for evaluating information retrieval systems. The interpolated precision of a given recall point is defined to be the maximum precision for that recall point and any higher recall point. This ensures that precision values are non-increasing with increased recall. For the example above, because 0.60 precision is found at 0.75 recall, the interpolated precision of all recall levels lower than 0.75 is 0.60. This method implements this interpolation by only returning points that are not dominated by other points that have both better precision and recall.

It is common to also see this graph completed with points (0,1) and (1,0), but this method does not include these limits. The one hundred percent precision implied by the first point is not necessarily achievable, whereas the second point will be no better than the last point in the return result.

Neither interpolated nor uninterpolated return values are guaranteed to be convex. Convex closure would skew results upward in an even more unrealistic direction, especially if the artificial completion point (0,1) were included.

Parameters:
interpolate - Set to true if the precisions are interpolated by pruning dominated points.
public double[][] prScoreCurve(boolean interpolate)

Returns the set of recall/precision/score operating points according to the scores of the cases, as in prCurve(boolean). In the example in the class documentation above, the scores are provided in the second column.

Parameters:
interpolate - Set to true if the precisions are interpolated by pruning dominated points.
public double[][] rocCurve(boolean interpolate)

Returns the receiver operating characteristic (ROC) curve for the cases ordered by score. Recall (sensitivity) is TP/(TP+FN), and rejection recall (specificity) is its negative dual TN/(TN+FP), where TP is the true positive count, FP the false positive count, FN the false negative count and TN the true negative count. Through specificity, the ROC curve provides information about rejection.

The last column in the example in prCurve(boolean) provides the rejection recall rates at each threshold. The resulting ROC curves for that example are listed in the class documentation above.

As with the recall-precision curve, the parameter determines whether or not to "interpolate" the rejection recall values. This is carried out as with the recall-precision curve by only returning values which would not be interpolated. In general, without interpolation, the same rows of the table are used as for the recall-precision curve, namely those at the end of a run of true positives. Interpolation may result in a different set of recall points in the pruned answer set, as in the example above.

Like the recall-precision curve method, this method does not insert artificial end points of (0,1) and (1,0) into the graph. As with the recall-precision curve, the final entry will have recall equal to one.

Neither interpolated nor uninterpolated return values are guaranteed to be convex. Convex closure would skew results upward in an even more unrealistic direction, especially if the artificial completion point (0,1) were included.

Parameters:
interpolate - If true, any point with both rejection recall and recall lower than another point's is eliminated from the returned curve.
public double maximumFMeasure()

Returns the maximum F1 measure for an actual operating point on the uninterpolated precision-recall curve, where the F1 measure is 2*recall*precision/(recall+precision). For the example in prCurve(boolean):

maximumFMeasure() = 0.67

corresponding to recall=0.75 and precision=0.60.
public double maximumFMeasure(double beta)

Returns the maximum Fβ measure for an actual operating point on the uninterpolated precision-recall curve for a specified β; for β=1 the measure is 2*recall*precision/(recall+precision). For the example in prCurve(boolean):

maximumFMeasure(1) = 0.67

corresponding to recall=0.75 and precision=0.60.
public double prBreakevenPoint()

Returns the breakeven point (BEP) for precision and recall based on the interpolated precision. For the example illustrated in prCurve(boolean), the breakeven point is 0.60. This is because the interpolated precision-recall curve is flat from the implicit initial point (0.00,0.60) to (0.75,0.60), and thus the line between them has a breakeven point of x = y = 0.6.

As an interpolation (equal precision and recall) of a rounded-up estimate (the interpolated recall-precision curve), the breakeven point is not necessarily an achievable operating point. Note that the recall-precision breakeven point will always be smaller than or equal to the maximum F measure, which does correspond to an observed operating point, because the breakeven point always involves lowering the recall of the first point on the curve with recall greater than precision to match the precision.

This method will return 0.0 if the precision-recall curve never crosses the diagonal.
public double averagePrecision()

Returns the pointwise average precision of points on the uninterpolated precision-recall curve. See prCurve(boolean) for a definition of the values on the curve.

This method implements the standard information retrieval definition, which only averages precision measurements from correct responses.

For the example provided in prCurve(boolean), the average precision is the average of the precision values for the correct responses:

averagePrecision() = (0.50 + 0.50 + 0.60 + 0.44)/4.00 = 0.51

Although the reasoning is different, the average precision returned is the same as the area under the uninterpolated recall-precision graph.
public double precisionAt(int rank)

Returns the precision score achieved by returning the top scoring documents up to the specified rank. If there are fewer results than the specified rank, Double.NaN is returned.
public double reciprocalRank()

Returns the reciprocal rank (RR) for this evaluation. Typically, the mean of the reciprocal ranks for a number of evaluations is reported.
public double areaUnderPrCurve(boolean interpolate)

Returns the area under the recall-precision curve with interpolation as specified. For the example detailed in prCurve(boolean), the areas without and with interpolation are:

areaUnderPrCurve(false) = 0.51
areaUnderPrCurve(true) = 0.56

Interpolation will always result in an equal or greater area. Note that the uninterpolated area under the recall-precision curve is the same as the average precision value.

Parameters:
interpolate - Set to true to interpolate the precision values.
public double areaUnderRocCurve(boolean interpolate)

Returns the area under the receiver operating characteristic (ROC) curve.

Parameters:
interpolate - Set to true to interpolate the rejection recall values.
public String toString()
toString
in class Object
public static void printPrecisionRecallCurve(double[][] prCurve, PrintWriter pw)

Prints a precision-recall curve with F measures. The curve is in the format returned by prCurve(boolean): an array of length-2 arrays of doubles. In each length-2 array, the recall value is at index 0 and the precision is at index 1. The printed curve has three columns in the following order: precision, recall, F measure.

Parameters:
prCurve - A precision-recall curve.
pw - The output PrintWriter.
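A printer with this column layout can be sketched as follows (illustrative code, not LingPipe source; the class name and exact formatting are invented).

```java
import java.io.PrintWriter;

// Illustrative sketch, not LingPipe source. Renders one row per
// {recall, precision} point in the documented column order:
// precision, recall, F1 measure.
public class PrCurvePrinter {

    static String render(double[][] prCurve) {
        StringBuilder sb = new StringBuilder("  PREC    REC     F1\n");
        for (double[] p : prCurve) {
            double recall = p[0];
            double precision = p[1];
            double f1 = 2.0 * recall * precision / (recall + precision);
            sb.append(String.format("%6.4f %6.4f %6.4f%n",
                                    precision, recall, f1));
        }
        return sb.toString();
    }

    static void print(double[][] prCurve, PrintWriter pw) {
        pw.print(render(prCurve));
        pw.flush();
    }

    public static void main(String[] args) {
        double[][] pr = { {0.75, 0.60}, {1.00, 0.44} };
        print(pr, new PrintWriter(System.out));
    }
}
```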

