com.aliasi.tag
Class NBestTaggerEvaluator<E>

java.lang.Object
  extended by com.aliasi.tag.NBestTaggerEvaluator<E>
Type Parameters:
E - Type of tokens in the tagging.
All Implemented Interfaces:
Handler, ObjectHandler<Tagging<E>>

public class NBestTaggerEvaluator<E>
extends Object
implements ObjectHandler<Tagging<E>>

An NBestTaggerEvaluator provides an evaluation framework for n-best taggers.

Test cases may be added directly using the addCase(Tagging,Iterator) method, which accepts a gold-standard reference tagging and a system response in the form of an iterator over scored taggings, such as the one produced by an n-best tagger.

A specific tagger may be supplied to the constructor or set later using the setTagger(NBestTagger) method. Test cases may then be supplied through the object handler method handle(Tagging), which accepts a gold-standard reference tagging. The tagger produces the system response for the reference tokens, and the pair is added as a test case.

The main n-best evaluation is the histogram returned by nBestHistogram(), which provides counts for the number of times the correct reference result was found in the response at a particular rank.
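
As a rough illustration of what "rank" means here, the following self-contained sketch finds the zero-based position of the reference tag sequence in an n-best list. It does not use LingPipe classes: plain lists of tag strings stand in for Tagging and ScoredTagging, and the names NBestRankSketch and rankOfReference, as well as the -1 return value for a miss, are assumptions made for this example rather than part of the documented API.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not a LingPipe class.
final class NBestRankSketch {

    /**
     * Returns the zero-based rank of the first response whose tags equal
     * the reference tags, examining at most maxNBest responses.
     * Returns -1 (an assumed convention) if the reference is not found.
     */
    static int rankOfReference(List<String> referenceTags,
                               Iterator<List<String>> responses,
                               int maxNBest) {
        for (int rank = 0; rank < maxNBest && responses.hasNext(); ++rank) {
            if (referenceTags.equals(responses.next())) {
                return rank;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> reference = Arrays.asList("DET", "NOUN", "VERB");
        List<List<String>> nBest = Arrays.asList(
            Arrays.asList("DET", "ADJ", "VERB"),   // rank 0: wrong
            Arrays.asList("DET", "NOUN", "VERB"),  // rank 1: matches reference
            Arrays.asList("DET", "NOUN", "NOUN")); // rank 2: wrong
        System.out.println(rankOfReference(reference, nBest.iterator(), 3)); // prints 1
        System.out.println(rankOfReference(reference, nBest.iterator(), 1)); // prints -1
    }
}
```

A histogram in the sense described above would then count, for each rank, how many test cases produced that rank.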

The method recallAtN() returns an array of recall values indexed by the rank of results. For instance, recallAtN()[0] is the percentage of cases for which the first-best result was correct, and recallAtN()[1] is the percentage of cases for which the correct result was returned as the first or second result (ranks 0 or 1).
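
The relationship between the rank histogram and recall at N can be sketched as a running sum. The code below is a minimal, self-contained illustration and not the library's implementation: a plain Map stands in for ObjectToCounterMap, the class and method names are hypothetical, and it assumes that recall is expressed as a fraction between 0 and 1 and that the array has one entry per examined rank, neither of which is stated in this documentation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a LingPipe class.
final class RecallAtNSketch {

    /**
     * Entry n of the result is the fraction of cases whose correct
     * result appeared at zero-based rank n or better.
     */
    static double[] recallAtN(Map<Integer, Integer> rankHistogram,
                              int maxNBest,
                              int numCases) {
        double[] recall = new double[maxNBest];
        int foundSoFar = 0;
        for (int rank = 0; rank < maxNBest; ++rank) {
            Integer count = rankHistogram.get(rank);
            if (count != null) {
                foundSoFar += count; // accumulate cases found at this rank
            }
            recall[rank] = numCases == 0 ? 0.0 : foundSoFar / (double) numCases;
        }
        return recall;
    }

    public static void main(String[] args) {
        // 10 cases: 6 correct at rank 0, 2 at rank 1, 1 at rank 2, 1 never found.
        Map<Integer, Integer> histogram = new HashMap<>();
        histogram.put(0, 6);
        histogram.put(1, 2);
        histogram.put(2, 1);
        System.out.println(Arrays.toString(recallAtN(histogram, 3, 10)));
        // prints [0.6, 0.8, 0.9]
    }
}
```

Because each entry adds to the previous one, the values never decrease as the rank grows.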

A string-based representation of the last case evaluated is available through lastCaseToString(int). The argument bounds the number of n-best results reported, subject to the maximum n-best value supplied to the constructor.

Thread Safety

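An n-best tagger evaluator must be read-write synchronized: any number of threads may call read methods concurrently, but a call to a write method requires exclusive access. The write methods are handle(), addCase(), and setTagger().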
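
One possible way for client code to meet this requirement is a small guard built on java.util.concurrent.locks.ReentrantReadWriteLock. This is a sketch only: the ReadWriteGuard class is hypothetical and not part of LingPipe, and the evaluator referred to in the comments is assumed to be an NBestTaggerEvaluator instance held by the caller.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical client-side helper, not a LingPipe class.
final class ReadWriteGuard {

    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Runs a mutating call (handle, addCase, setTagger) with exclusive access. */
    void write(Runnable action) {
        lock.writeLock().lock();
        try {
            action.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Runs a read-only call (for example numCases or recallAtN) under the shared lock. */
    <T> T read(Supplier<T> query) {
        lock.readLock().lock();
        try {
            return query.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Intended use, assuming an evaluator and a reference tagging exist:
    //   guard.write(() -> evaluator.handle(referenceTagging));
    //   double[] recall = guard.read(() -> evaluator.recallAtN());
}
```

Routing every access to a shared evaluator through one such guard lets readers proceed in parallel while writers are serialized.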

Since:
LingPipe 3.9
Version:
3.9
Author:
Bob Carpenter

Constructor Summary
NBestTaggerEvaluator(NBestTagger<E> tagger, int maxNBest, int maxNBestToString)
          Construct an n-best tagger evaluator using the specified tagger, examining at most maxNBest response taggings per test case and writing at most maxNBestToString outputs to the last case string.
 
Method Summary
 void addCase(Tagging<E> referenceTagging, Iterator<ScoredTagging<E>> responseTaggingIterator)
          Add a test case consisting of the specified reference tagging and iterator over responses.
 void handle(Tagging<E> referenceTagging)
          Add the specified reference tagging as a test case, with a response tagging computed by the contained n-best tagger.
 String lastCaseToString(int maxNBestReport)
          Return a string-based representation of the last case evaluated.
 int maxNBest()
          Returns the maximum number of results examined in the response for each test case.
 ObjectToCounterMap<Integer> nBestHistogram()
          Return the histogram of results mapping ranks to the number of test cases where the correct result was at that rank.
 int numCases()
          Return the number of test cases in this evaluation.
 long numTokens()
          Return the total number of tokens in all test cases for this evaluator.
 double[] recallAtN()
          Return an array of recall values indexed by rank.
 void setTagger(NBestTagger<E> tagger)
          Set the tagger to the specified value.
 NBestTagger<E> tagger()
          Return the n-best tagger used for this evaluator.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NBestTaggerEvaluator

public NBestTaggerEvaluator(NBestTagger<E> tagger,
                            int maxNBest,
                            int maxNBestToString)
Construct an n-best tagger evaluator using the specified tagger, examining at most maxNBest response taggings per test case and writing at most maxNBestToString outputs to the last case string.

Parameters:
tagger - Tagger to evaluate, or null if cases are added directly or the tagger is set later.
maxNBest - Maximum number of n-best results in the system response to evaluate.
maxNBestToString - Maximum number of n-best results to write to the string output for the last case.
Method Detail

maxNBest

public int maxNBest()
Returns the maximum number of results examined in the response for each test case.

Returns:
Maximum n-best explored in responses.

setTagger

public void setTagger(NBestTagger<E> tagger)
Set the tagger to the specified value.

Parameters:
tagger - Tagger to use for evaluation.

tagger

public NBestTagger<E> tagger()
Return the n-best tagger used for this evaluator.

Returns:
The tagger being evaluated.

handle

public void handle(Tagging<E> referenceTagging)
Add the specified reference tagging as a test case, with a response tagging computed by the contained n-best tagger.

Specified by:
handle in interface ObjectHandler<Tagging<E>>
Parameters:
referenceTagging - Reference tagging to evaluate.

addCase

public void addCase(Tagging<E> referenceTagging,
                    Iterator<ScoredTagging<E>> responseTaggingIterator)
Add a test case consisting of the specified reference tagging and iterator over responses.

Parameters:
referenceTagging - Reference gold-standard tagging.
responseTaggingIterator - System response as an iterator over scored taggings.

nBestHistogram

public ObjectToCounterMap<Integer> nBestHistogram()
Return the histogram of results mapping ranks to the number of test cases where the correct result was at that rank.

Returns:
Histogram of n-best result ranks.

recallAtN

public double[] recallAtN()
Return an array of recall values indexed by rank. The recall at rank n is defined as the percentage of test cases in which the correct result was found at rank n or better in the response.

Returns:
Array of recall at N values.

numCases

public int numCases()
Return the number of test cases in this evaluation.

Returns:
Number of test cases.

numTokens

public long numTokens()
Return the total number of tokens in all test cases for this evaluator.

Returns:
Number of tokens tested.

lastCaseToString

public String lastCaseToString(int maxNBestReport)
Return a string-based representation of the last case evaluated. The n-best results are printed up to the number specified in the argument, capped at the maximum n-best value supplied to the constructor.

Parameters:
maxNBestReport - Maximum number of n-best results to report.
Returns:
String-based representation of the last case.