## com.aliasi.classify Class PrecisionRecallEvaluation

```
java.lang.Object
  com.aliasi.classify.PrecisionRecallEvaluation
```

`public class PrecisionRecallEvaluation extends Object`

A `PrecisionRecallEvaluation` collects and reports a suite of descriptive statistics for binary classification tasks. The basis of a precision recall evaluation is a matrix of counts of reference and response classifications. Each cell in the matrix corresponds to a method returning a long integer count.

| | Response true | Response false | Reference Totals |
|---|---|---|---|
| Reference true | `truePositive()` (TP) | `falseNegative()` (FN) | `positiveReference()` (TP+FN) |
| Reference false | `falsePositive()` (FP) | `trueNegative()` (TN) | `negativeReference()` (FP+TN) |
| Response Totals | `positiveResponse()` (TP+FP) | `negativeResponse()` (FN+TN) | `total()` (TP+FN+FP+TN) |
The most basic statistic is accuracy, which is the number of correct responses divided by the total number of cases.
``` accuracy() = correct() / total() ```
This class derives its name from the following four statistics, which are illustrated in the four tables.
``` recall() = truePositive() / positiveReference() ```
``` precision() = truePositive() / positiveResponse() ```
``` rejectionRecall() = trueNegative() / negativeReference() ```
``` rejectionPrecision() = trueNegative() / negativeResponse() ```
Each measure is defined to be the count in the cell marked + divided by the sum of the cells marked + and - in the corresponding table:

| Recall | Response true | Response false |
|---|---|---|
| Reference true | + | - |
| Reference false | | |

| Precision | Response true | Response false |
|---|---|---|
| Reference true | + | |
| Reference false | - | |

| Rejection Recall | Response true | Response false |
|---|---|---|
| Reference true | | |
| Reference false | - | + |

| Rejection Precision | Response true | Response false |
|---|---|---|
| Reference true | | - |
| Reference false | | + |
These tables illustrate the relevant dualities. Precision is dual to recall if the reference and response are switched (the matrix is transposed). Similarly, rejection recall is dual to recall with the true and false labels switched (reflection around each axis in turn); rejection precision is similarly dual to precision.
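The four definitions can be sketched directly from the cell counts. The following is a minimal illustration, not the LingPipe implementation; the class name is made up here, and the counts come from the Cab-vs-All example table later on this page (TP=9, FN=3, FP=4, TN=11):

```java
// Minimal sketch of the four basic measures; not the LingPipe implementation.
class BasicStats {
    // recall() = TP / (TP + FN)
    static double recall(long tp, long fn) { return (double) tp / (tp + fn); }
    // precision() = TP / (TP + FP)
    static double precision(long tp, long fp) { return (double) tp / (tp + fp); }
    // rejectionRecall() = TN / (FP + TN)
    static double rejectionRecall(long tn, long fp) { return (double) tn / (fp + tn); }
    // rejectionPrecision() = TN / (FN + TN)
    static double rejectionPrecision(long tn, long fn) { return (double) tn / (fn + tn); }

    public static void main(String[] args) {
        // Cab-vs-All example counts: TP=9, FN=3, FP=4, TN=11.
        System.out.printf("recall=%.4f precision=%.4f rejRecall=%.4f rejPrecision=%.4f%n",
                recall(9, 3), precision(9, 4), rejectionRecall(11, 4), rejectionPrecision(11, 3));
    }
}
```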

Precision and recall may be combined by weighted harmonic averaging using the f-measure statistic, where `β`, ranging from 0 to infinity, sets the relative weight of precision, with 1 being a neutral value.

``` fMeasure() = fMeasure(1) ```
``` fMeasure(β) = (1 + β^2) * precision() * recall() / (recall() + β^2 * precision()) ```
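The Fβ formula above can be sketched as follows (illustrative code, not the library's implementation). With the Cabernet example values, recall = 0.75 and precision = 9/13, F1 works out to exactly 0.72, matching the `fMeasure()` row in the results table later on this page:

```java
// Illustrative sketch of the F-measure formula; not the LingPipe implementation.
class FMeasureSketch {
    // fMeasure(beta) = (1 + beta^2) * precision * recall / (recall + beta^2 * precision)
    static double fMeasure(double beta, double recall, double precision) {
        double b2 = beta * beta;
        return (1 + b2) * precision * recall / (recall + b2 * precision);
    }

    public static void main(String[] args) {
        double r = 0.75, p = 9.0 / 13.0;  // Cabernet example: recall, precision
        System.out.printf("F1 = %.4f%n", fMeasure(1.0, r, p));
    }
}
```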

There are four traditional measures of binary classification, which are as follows.

``` fowlkesMallows() = truePositive() / (precision() * recall())^(1/2) ```
``` jaccardCoefficient() = truePositive() / (total() - trueNegative()) ```
``` yulesQ() = (truePositive() * trueNegative() - falsePositive() * falseNegative()) / (truePositive() * trueNegative() + falsePositive() * falseNegative()) ```
``` yulesY() = ((truePositive() * trueNegative())^(1/2) - (falsePositive() * falseNegative())^(1/2)) / ((truePositive() * trueNegative())^(1/2) + (falsePositive() * falseNegative())^(1/2)) ```
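These four measures can also be sketched directly from the counts (illustrative code, not the LingPipe source; counts again from the Cab-vs-All example, TP=9, FN=3, FP=4, TN=11):

```java
// Illustrative sketches of the four traditional measures; not the LingPipe code.
class TraditionalMeasures {
    static double fowlkesMallows(long tp, long fn, long fp) {
        double recall = (double) tp / (tp + fn);
        double precision = (double) tp / (tp + fp);
        return tp / Math.sqrt(precision * recall);
    }
    static double jaccard(long tp, long fn, long fp) {
        return (double) tp / (tp + fn + fp);  // = TP / (total() - TN)
    }
    static double yulesQ(long tp, long fn, long fp, long tn) {
        double a = (double) tp * tn, b = (double) fp * fn;
        return (a - b) / (a + b);
    }
    static double yulesY(long tp, long fn, long fp, long tn) {
        double a = Math.sqrt((double) tp * tn), b = Math.sqrt((double) fp * fn);
        return (a - b) / (a + b);
    }

    public static void main(String[] args) {
        // Cab-vs-All: TP=9, FN=3, FP=4, TN=11
        System.out.printf("FM=%.2f J=%.4f Q=%.4f Y=%.4f%n",
                fowlkesMallows(9, 3, 4), jaccard(9, 3, 4),
                yulesQ(9, 3, 4, 11), yulesY(9, 3, 4, 11));
    }
}
```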

Replacing precision and recall with their definitions, `TP/(TP+FP)` and `TP/(TP+FN)`:

```
F1 = 2 * (TP/(TP+FP)) * (TP/(TP+FN))
       / (TP/(TP+FP) + TP/(TP+FN))
   = 2 * (TP*TP / ((TP+FP)*(TP+FN)))
       / (TP*(TP+FN)/((TP+FP)*(TP+FN)) + TP*(TP+FP)/((TP+FN)*(TP+FP)))
   = 2 * (TP / ((TP+FP)*(TP+FN)))
       / ((TP+FN)/((TP+FP)*(TP+FN)) + (TP+FP)/((TP+FN)*(TP+FP)))
   = 2 * TP
       / ((TP+FN) + (TP+FP))
   = 2*TP / (2*TP + FP + FN)
```
Thus the F1-measure is very closely related to the Jaccard coefficient, `TP/(TP+FP+FN)`. Like the Jaccard coefficient, the F measure does not vary with varying true negative counts. Rejection precision and recall do vary with changes in true negative count.
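The identity can be checked numerically; the simplified form also makes the true-negative invariance explicit, since TN never appears in it. The following sketch (illustrative, not library code) compares both forms on the three example classifiers later on this page:

```java
// Numerical check that the harmonic-mean form of F1 equals 2*TP / (2*TP + FP + FN).
class F1Identity {
    static double f1Harmonic(long tp, long fp, long fn) {
        double p = (double) tp / (tp + fp);
        double r = (double) tp / (tp + fn);
        return 2 * p * r / (p + r);
    }
    static double f1Simplified(long tp, long fp, long fn) {
        return 2.0 * tp / (2 * tp + fp + fn);  // no TN term anywhere
    }

    public static void main(String[] args) {
        long[][] counts = {{9, 4, 3}, {5, 4, 4}, {4, 1, 2}};  // Cab, Syrah, Pinot: TP, FP, FN
        for (long[] c : counts)
            System.out.printf("%.6f == %.6f%n",
                    f1Harmonic(c[0], c[1], c[2]), f1Simplified(c[0], c[1], c[2]));
    }
}
```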

Basic reference and response likelihoods are computed by frequency.

``` referenceLikelihood() = positiveReference() / total() ```
``` responseLikelihood() = positiveResponse() / total() ```
An algorithm that chose responses at random according to the response likelihood would have the following accuracy against test cases chosen at random according to the reference likelihood:
``` randomAccuracy() = referenceLikelihood() * responseLikelihood() + (1 - referenceLikelihood()) * (1 - responseLikelihood()) ```
The two summands are the likelihood of a true positive and the likelihood of a true negative. From random accuracy, the κ-statistic is defined by dividing out the random accuracy from the accuracy, giving a measure of performance above a chance baseline.
``` kappa() = kappa(accuracy(),randomAccuracy()) ```
``` kappa(p,e) = (p - e) / (1 - e) ```
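Sketching these two definitions with the Cabernet example (accuracy = 20/27, reference likelihood = 12/27, response likelihood = 13/27) reproduces, up to rounding, the `randomAccuracy()` and `kappa()` values in the results table below (illustrative code, not the library's):

```java
// Illustrative sketch of random accuracy and the kappa statistic.
class KappaSketch {
    // randomAccuracy() = refL * respL + (1 - refL) * (1 - respL)
    static double randomAccuracy(double refLikelihood, double respLikelihood) {
        return refLikelihood * respLikelihood + (1 - refLikelihood) * (1 - respLikelihood);
    }
    // kappa(p, e) = (p - e) / (1 - e)
    static double kappa(double accuracy, double expected) {
        return (accuracy - expected) / (1 - expected);
    }

    public static void main(String[] args) {
        double refL = 12.0 / 27, respL = 13.0 / 27, acc = 20.0 / 27;  // Cabernet example
        double e = randomAccuracy(refL, respL);
        System.out.printf("randomAccuracy=%.4f kappa=%.4f%n", e, kappa(acc, e));
    }
}
```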

There are two alternative forms of the κ-statistic, both of which attempt to correct for putative bias in the estimation of random accuracy. The first computes the random accuracy by taking the average of the reference and response likelihoods as a shared baseline likelihood, yielding the so-called unbiased random accuracy and the unbiased κ-statistic:

``` avgLikelihood() = (referenceLikelihood() + responseLikelihood()) / 2 ```
``` randomAccuracyUnbiased() = avgLikelihood()^2 + (1 - avgLikelihood())^2 ```
``` kappaUnbiased() = kappa(accuracy(),randomAccuracyUnbiased()) ```

Kappa can also be adjusted for the prevalence of positive reference cases, which leads to the following simple definition:

``` kappaNoPrevalence() = (2 * accuracy()) - 1 ```

Pearson's χ² statistic is provided by the following method:

``` chiSquared() = total() * phiSquared() ```
``` phiSquared() = ((truePositive()*trueNegative()) - (falsePositive()*falseNegative()))^2 / ((truePositive()+falseNegative()) * (falsePositive()+trueNegative()) * (truePositive()+falsePositive()) * (falseNegative()+trueNegative())) ```
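With the Cabernet counts this gives φ² = (9·11 − 4·3)² / (12·15·13·14) ≈ 0.2310 and χ² ≈ 6.2382, matching the results table below. A sketch of the two formulas (illustrative, not the LingPipe source):

```java
// Illustrative sketch of the phi-squared and chi-squared statistics.
class ChiSquaredSketch {
    static double phiSquared(long tp, long fn, long fp, long tn) {
        double num = (double) tp * tn - (double) fp * fn;
        double den = (double) (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn);
        return num * num / den;
    }
    // chiSquared() = total() * phiSquared()
    static double chiSquared(long tp, long fn, long fp, long tn) {
        return (tp + fn + fp + tn) * phiSquared(tp, fn, fp, tn);
    }

    public static void main(String[] args) {
        // Cab-vs-All: TP=9, FN=3, FP=4, TN=11
        System.out.printf("phi^2=%.4f chi^2=%.4f%n",
                phiSquared(9, 3, 4, 11), chiSquared(9, 3, 4, 11));
    }
}
```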

The accuracy deviation is the deviation of the average number of positive cases in a binomial distribution with accuracy equal to the classification accuracy and number of trials equal to the total number of cases.

``` accuracyDeviation() = (accuracy() * (1 - accuracy()) / total())^(1/2) ```
This number can be used to provide error intervals around the accuracy results.
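For instance, under a normal approximation to the binomial, accuracy ± 1.96 standard deviations gives a rough 95% interval. The interval construction below is an illustrative sketch and not part of this class:

```java
// Illustrative sketch: binomial standard deviation of the accuracy and a rough
// 95% interval under a normal approximation. Not part of the LingPipe class.
class AccuracyInterval {
    static double accuracyDeviation(double accuracy, long total) {
        return Math.sqrt(accuracy * (1 - accuracy) / total);
    }

    public static void main(String[] args) {
        double acc = 20.0 / 27;                   // Cabernet example accuracy
        double dev = accuracyDeviation(acc, 27);  // matches the 0.0843 in the table
        System.out.printf("accuracy=%.4f +/- %.4f (95%%: [%.4f, %.4f])%n",
                acc, dev, acc - 1.96 * dev, acc + 1.96 * dev);
    }
}
```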

Using the following three tables as examples:

| Cab-vs-All | Response: Cab | Response: Other |
|---|---|---|
| Reference: Cab | 9 | 3 |
| Reference: Other | 4 | 11 |

| Syrah-vs-All | Response: Syrah | Response: Other |
|---|---|---|
| Reference: Syrah | 5 | 4 |
| Reference: Other | 4 | 14 |

| Pinot-vs-All | Response: Pinot | Response: Other |
|---|---|---|
| Reference: Pinot | 4 | 2 |
| Reference: Other | 1 | 20 |
The various statistics evaluate to the following values:
| Method | Cabernet | Syrah | Pinot |
|---|---|---|---|
| `positiveReference()` | 12 | 9 | 6 |
| `negativeReference()` | 15 | 18 | 21 |
| `positiveResponse()` | 13 | 9 | 5 |
| `negativeResponse()` | 14 | 18 | 22 |
| `correctResponse()` | 20 | 19 | 24 |
| `total()` | 27 | 27 | 27 |
| `accuracy()` | 0.7407 | 0.7037 | 0.8889 |
| `recall()` | 0.7500 | 0.5555 | 0.6666 |
| `precision()` | 0.6923 | 0.5555 | 0.8000 |
| `rejectionRecall()` | 0.7333 | 0.7778 | 0.9524 |
| `rejectionPrecision()` | 0.7858 | 0.7778 | 0.9091 |
| `fMeasure()` | 0.7200 | 0.5555 | 0.7272 |
| `fowlkesMallows()` | 12.49 | 9.00 | 5.48 |
| `jaccardCoefficient()` | 0.5625 | 0.3846 | 0.5714 |
| `yulesQ()` | 0.7838 | 0.6279 | 0.9512 |
| `yulesY()` | 0.4835 | 0.3531 | 0.7269 |
| `referenceLikelihood()` | 0.4444 | 0.3333 | 0.2222 |
| `responseLikelihood()` | 0.4815 | 0.3333 | 0.1852 |
| `randomAccuracy()` | 0.5021 | 0.5556 | 0.6749 |
| `kappa()` | 0.4792 | 0.3333 | 0.6583 |
| `randomAccuracyUnbiased()` | 0.5027 | 0.5556 | 0.6756 |
| `kappaUnbiased()` | 0.4789 | 0.3333 | 0.6575 |
| `kappaNoPrevalence()` | 0.4814 | 0.4074 | 0.7778 |
| `chiSquared()` | 6.2382 | 3.0000 | 11.8519 |
| `phiSquared()` | 0.2310 | 0.1111 | 0.4390 |
| `accuracyDeviation()` | 0.0843 | 0.0879 | 0.0605 |
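To show how the count matrix is accumulated case by case, here is a toy re-implementation of the `addCase(boolean,boolean)` bookkeeping (a sketch only, not the LingPipe source); feeding it the Cab-vs-All cases reproduces the first column above:

```java
// Toy re-implementation of the addCase(boolean,boolean) bookkeeping;
// not the LingPipe source, just an illustration of the count matrix.
class MiniEval {
    long tp, fn, fp, tn;

    void addCase(boolean reference, boolean response) {
        if (reference && response) tp++;
        else if (reference) fn++;  // reference true, response false
        else if (response) fp++;   // reference false, response true
        else tn++;                 // both false
    }
    long total() { return tp + fn + fp + tn; }
    double accuracy() { return (double) (tp + tn) / total(); }

    // Builds the Cab-vs-All example: TP=9, FN=3, FP=4, TN=11.
    static MiniEval cabernetExample() {
        MiniEval e = new MiniEval();
        for (int i = 0; i < 9; i++) e.addCase(true, true);
        for (int i = 0; i < 3; i++) e.addCase(true, false);
        for (int i = 0; i < 4; i++) e.addCase(false, true);
        for (int i = 0; i < 11; i++) e.addCase(false, false);
        return e;
    }

    public static void main(String[] args) {
        MiniEval e = cabernetExample();
        System.out.printf("total=%d accuracy=%.4f%n", e.total(), e.accuracy());
    }
}
```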

Since:
LingPipe 2.1
Version:
2.1
Author:
Bob Carpenter

Constructor Summary
`PrecisionRecallEvaluation()`
Construct a precision-recall evaluation with all counts set to zero.
```PrecisionRecallEvaluation(long tp, long fn, long fp, long tn)```
Constructs a precision-recall evaluation initialized with the specified counts.

Method Summary
` double` `accuracy()`
Returns the sample accuracy of the responses.
` double` `accuracyDeviation()`
Returns the standard deviation of the accuracy.
` void` ```addCase(boolean reference, boolean response)```
Adds a case with the specified reference and response classifications.
` double` `chiSquared()`
Returns the χ² value.
` long` `correctResponse()`
Returns the number of cases where the response is correct.
` long` `falseNegative()`
Returns the number of false negative cases.
` long` `falsePositive()`
Returns the number of false positive cases.
` double` `fMeasure()`
Returns the F1 measure.
` double` `fMeasure(double beta)`
Returns the `Fβ` value for the specified `β`.
`static double` ```fMeasure(double beta, double recall, double precision)```
Returns the Fβ measure for the specified β, recall, and precision values.
` double` `fowlkesMallows()`
Return the Fowlkes-Mallows score.
` long` `incorrectResponse()`
Returns the number of cases where the response is incorrect.
` double` `jaccardCoefficient()`
Returns the Jaccard coefficient.
` double` `kappa()`
Returns the value of the kappa statistic.
` double` `kappaNoPrevalence()`
Returns the value of the kappa statistic adjusted for prevalence.
` double` `kappaUnbiased()`
Returns the value of the unbiased kappa statistic.
` long` `negativeReference()`
Returns the number of negative reference cases.
` long` `negativeResponse()`
Returns the number of negative response cases.
` double` `phiSquared()`
Returns the φ² value.
` long` `positiveReference()`
Returns the number of positive reference cases.
` long` `positiveResponse()`
Returns the number of positive response cases.
` double` `precision()`
Returns the precision.
` double` `randomAccuracy()`
The probability that the reference and response are the same if they are generated randomly according to the reference and response likelihoods.
` double` `randomAccuracyUnbiased()`
The probability that the reference and the response are the same if the reference and response likelihoods are both the average of the sample reference and response likelihoods.
` double` `recall()`
Returns the recall.
` double` `referenceLikelihood()`
Returns the sample reference likelihood, which is the number of positive references divided by the total number of cases.
` double` `rejectionPrecision()`
Returns the rejection precision, or selectivity, value.
` double` `rejectionRecall()`
Returns the rejection recall, or specificity, value.
` double` `responseLikelihood()`
Returns the sample response likelihood, which is the number of positive responses divided by the total number of cases.
` String` `toString()`
Returns a string-based representation of this evaluation.
` long` `total()`
Returns the total number of cases.
` long` `trueNegative()`
Returns the number of true negative cases.
` long` `truePositive()`
Returns the number of true positive cases.
` double` `yulesQ()`
Return the value of Yule's Q statistic.
` double` `yulesY()`
Return the value of Yule's Y statistic.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait`

Constructor Detail

### PrecisionRecallEvaluation

`public PrecisionRecallEvaluation()`
Construct a precision-recall evaluation with all counts set to zero.

### PrecisionRecallEvaluation

```public PrecisionRecallEvaluation(long tp,
long fn,
long fp,
long tn)```
Constructs a precision-recall evaluation initialized with the specified counts.

Parameters:
`tp` - True positive count.
`fn` - False negative count.
`fp` - False positive count.
`tn` - True negative count.
Throws:
`IllegalArgumentException` - If any of the counts are negative.
Method Detail

### addCase

```public void addCase(boolean reference,
boolean response)```
Adds a case with the specified reference and response classifications.

Parameters:
`reference` - Reference classification.
`response` - Response classification.

### truePositive

`public long truePositive()`
Returns the number of true positive cases. A true positive is where both the reference and response are true.

Returns:
The number of true positives.

### falsePositive

`public long falsePositive()`
Returns the number of false positive cases. A false positive is where the reference is false and response is true.

Returns:
The number of false positives.

### trueNegative

`public long trueNegative()`
Returns the number of true negative cases. A true negative is where both the reference and response are false.

Returns:
The number of true negatives.

### falseNegative

`public long falseNegative()`
Returns the number of false negative cases. A false negative is where the reference is true and response is false.

Returns:
The number of false negatives.

### positiveReference

`public long positiveReference()`
Returns the number of positive reference cases. A positive reference case is one where the reference is true.

Returns:
The number of positive references.

### negativeReference

`public long negativeReference()`
Returns the number of negative reference cases. A negative reference case is one where the reference is false.

Returns:
The number of negative references.

### referenceLikelihood

`public double referenceLikelihood()`
Returns the sample reference likelihood, which is the number of positive references divided by the total number of cases.

Returns:
The sample reference likelihood.

### positiveResponse

`public long positiveResponse()`
Returns the number of positive response cases. A positive response case is one where the response is true.

Returns:
The number of positive responses.

### negativeResponse

`public long negativeResponse()`
Returns the number of negative response cases. A negative response case is one where the response is false.

Returns:
The number of negative responses.

### responseLikelihood

`public double responseLikelihood()`
Returns the sample response likelihood, which is the number of positive responses divided by the total number of cases.

Returns:
The sample response likelihood.

### correctResponse

`public long correctResponse()`
Returns the number of cases where the response is correct. A correct response is one where the reference and response are the same.

Returns:
The number of correct responses.

### incorrectResponse

`public long incorrectResponse()`
Returns the number of cases where the response is incorrect. An incorrect response is one where the reference and response are different.

Returns:
The number of incorrect responses.

### total

`public long total()`
Returns the total number of cases.

Returns:
The total number of cases.

### accuracy

`public double accuracy()`
Returns the sample accuracy of the responses. The accuracy is just the number of correct responses divided by the total number of cases.

Returns:
The sample accuracy.

### recall

`public double recall()`
Returns the recall. The recall is the number of true positives divided by the number of positive references. This is the fraction of positive reference cases that were found by the classifier.

Returns:
The recall value.

### precision

`public double precision()`
Returns the precision. The precision is the number of true positives divided by the number of positive responses. This is the fraction of positive responses returned by the classifier that were correct.

Returns:
The precision value.

### rejectionRecall

`public double rejectionRecall()`
Returns the rejection recall, or specificity, value. The rejection recall is the percentage of negative references that had negative responses.

Returns:
The rejection recall value.

### rejectionPrecision

`public double rejectionPrecision()`
Returns the rejection precision, or selectivity, value. The rejection precision is the percentage of negative responses that were negative references.

Returns:
The rejection precision value.

### fMeasure

`public double fMeasure()`
Returns the F1 measure. This is the result of applying the method `fMeasure(double)` to `1`.

Returns:
The F1 measure.

### fMeasure

`public double fMeasure(double beta)`
Returns the `Fβ` value for the specified `β`.

Parameters:
`beta` - The `β` parameter.
Returns:
The `Fβ` value.

### jaccardCoefficient

`public double jaccardCoefficient()`
Returns the Jaccard coefficient.

Returns:
The Jaccard coefficient.

### chiSquared

`public double chiSquared()`
Returns the χ² value.

Returns:
The χ² value.

### phiSquared

`public double phiSquared()`
Returns the φ² value.

Returns:
The φ² value.

### yulesQ

`public double yulesQ()`
Return the value of Yule's Q statistic.

Returns:
The value of Yule's Q statistic.

### yulesY

`public double yulesY()`
Return the value of Yule's Y statistic.

Returns:
The value of Yule's Y statistic.

### fowlkesMallows

`public double fowlkesMallows()`
Return the Fowlkes-Mallows score.

Returns:
The Fowlkes-Mallows score.

### accuracyDeviation

`public double accuracyDeviation()`
Returns the standard deviation of the accuracy. This is computed as the deviation of an equivalent accuracy generated by a binomial distribution, which is just a sequence of Bernoulli (binary) trials.

Returns:
The standard deviation of the accuracy.

### randomAccuracy

`public double randomAccuracy()`
The probability that the reference and response are the same if they are generated randomly according to the reference and response likelihoods.

Returns:
The accuracy of a random classifier.

### randomAccuracyUnbiased

`public double randomAccuracyUnbiased()`
The probability that the reference and the response are the same if the reference and response likelihoods are both the average of the sample reference and response likelihoods.

Returns:
The unbiased random accuracy.

### kappa

`public double kappa()`
Returns the value of the kappa statistic.

Returns:
The value of the kappa statistic.

### kappaUnbiased

`public double kappaUnbiased()`
Returns the value of the unbiased kappa statistic.

Returns:
The value of the unbiased kappa statistic.

### kappaNoPrevalence

`public double kappaNoPrevalence()`
Returns the value of the kappa statistic adjusted for prevalence.

Returns:
The value of the kappa statistic adjusted for prevalence.

### toString

`public String toString()`
Returns a string-based representation of this evaluation.

Overrides:
`toString` in class `Object`
Returns:
A string-based representation of this evaluation.

### fMeasure

```public static double fMeasure(double beta,
double recall,
double precision)```
Returns the Fβ measure for the specified β, recall, and precision values.

Parameters:
`beta` - Relative weighting of precision.
`recall` - Recall value.
`precision` - Precision value.
Returns:
The Fβ measure.