com.aliasi.chunk
Class ChunkerEvaluator

java.lang.Object
  extended by com.aliasi.chunk.ChunkerEvaluator
All Implemented Interfaces:
Handler, ObjectHandler<Chunking>, TagHandler

public class ChunkerEvaluator
extends Object
implements TagHandler, ObjectHandler<Chunking>

The ChunkerEvaluator class provides an evaluation framework for chunkers. An instance of this class is constructed based on the chunker to be evaluated. This class implements the ObjectHandler<Chunking> interface in order to receive reference chunkings. Reference chunkings may be added directly using the handle(Chunking) method or by passing this handler to an appropriate parser. Either way, the character sequence is extracted from the reference chunking, the contained chunker is used to generate a response chunking, and then the reference and response chunkings are added to a contained ChunkingEvaluation, which maintains a running score. The method evaluation() returns the contained chunking evaluation, which may be inspected for partial results at any time.

Thread Safety

Evaluators are not thread safe. To use an evaluator from multiple threads, read/write synchronization is required on its methods. Read methods return scores; write methods alter the evaluator's state, either by changing the underlying chunker or by adding examples.
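
The read/write synchronization described above can be sketched with a java.util.concurrent.locks.ReentrantReadWriteLock. This is only an illustration under stated assumptions: CountingEvaluator, GuardedEvaluator and GuardedEvaluatorDemo are hypothetical stand-ins written for this example and are not LingPipe classes. A real wrapper would guard calls on a ChunkerEvaluator in the same way.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an evaluator with mutable state (not a LingPipe class).
class CountingEvaluator {
    private int correct;
    private int total;

    // Write method: alters the evaluator's state by adding a case.
    void addCase(boolean isCorrect) {
        total++;
        if (isCorrect) correct++;
    }

    // Read method: returns a score without changing state.
    double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }
}

// Wrapper that takes the write lock for write methods and the read lock for read methods.
class GuardedEvaluator {
    private final CountingEvaluator evaluator = new CountingEvaluator();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void addCase(boolean isCorrect) {
        lock.writeLock().lock();
        try {
            evaluator.addCase(isCorrect);
        } finally {
            lock.writeLock().unlock();
        }
    }

    double accuracy() {
        lock.readLock().lock();
        try {
            return evaluator.accuracy();
        } finally {
            lock.readLock().unlock();
        }
    }
}

class GuardedEvaluatorDemo {
    public static void main(String[] args) throws InterruptedException {
        GuardedEvaluator guarded = new GuardedEvaluator();
        // Two writer threads each add 1000 cases, half of them correct.
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) guarded.addCase(i % 2 == 0);
        };
        Thread first = new Thread(writer);
        Thread second = new Thread(writer);
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println(guarded.accuracy()); // prints 0.5
    }
}
```

Multiple readers may hold the read lock at once, while a writer excludes both readers and other writers, which matches the read/write split described for this class.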

Since:
LingPipe2.1
Version:
3.9.1
Author:
Bob Carpenter

Constructor Summary
ChunkerEvaluator(Chunker chunker)
          Construct an evaluator for the specified chunker.
 
Method Summary
 Chunker chunker()
          Returns the underlying chunker for this evaluator.
 ScoredPrecisionRecallEvaluation confidenceEvaluation()
          Returns the scored precision-recall evaluation derived from a confidence-based chunker.
 ChunkingEvaluation evaluation()
          Returns the first-best chunking evaluation.
 void handle(Chunking referenceChunking)
          Handle the specified reference chunking.
 void handle(String[] tokens, String[] whitespaces, String[] tags)
          Deprecated. Wrap with BioTagChunkCodec instead.
 String lastConfidenceCaseReport()
          Returns a string-based representation of the last evaluation case's confidence evaluation.
 String lastFirstBestCaseReport()
          Returns a string-based representation of the last evaluation case and the first-best result.
 String lastNBestCaseReport()
          Returns a string-based representation of the last n-best evaluation case.
 ObjectToCounterMap<Integer> nBestEvaluation()
          Returns the n-best evaluation in the form of a mapping from ranks to the number of times the reference chunking was that rank in the evaluation.
 void setChunker(Chunker chunker)
          Sets the underlying chunker to the specified value.
 void setMaxConfidenceChunks(int n)
          Sets the maximum number of chunks extracted by a confidence-based chunker for evaluation.
 void setMaxNBest(int n)
          Sets the maximum number of chunkings extracted by an n-best chunker for evaluation.
 void setMaxNBestReport(int n)
          Sets the maximum number of chunkings that will be reported in a case report.
 void setVerbose(boolean isVerbose)
          Sets the verbosity level of this evaluator to the specified value.
 String toString()
          Returns a string-based representation of this evaluation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ChunkerEvaluator

public ChunkerEvaluator(Chunker chunker)
Construct an evaluator for the specified chunker.

Parameters:
chunker - Chunker to evaluate.

Method Detail

chunker

public Chunker chunker()
Returns the underlying chunker for this evaluator.

Returns:
The underlying chunker.

setChunker

public void setChunker(Chunker chunker)
Sets the underlying chunker to the specified value.

Parameters:
chunker - New underlying chunker for this evaluator.

setVerbose

public void setVerbose(boolean isVerbose)
Sets the verbosity level of this evaluator to the specified value. If the argument is true, calls to handle(Chunking) will print (to System.out) a report for each chunking evaluation (first-best, n-best and confidence).

The reports that are written are also available programmatically as strings through the methods lastFirstBestCaseReport(), lastNBestCaseReport(), and lastConfidenceCaseReport().

Parameters:
isVerbose - true for standard output per case.

lastFirstBestCaseReport

public String lastFirstBestCaseReport()
Returns a string-based representation of the last evaluation case and the first-best result.

Returns:
The first-best report for the last case handled.

setMaxConfidenceChunks

public void setMaxConfidenceChunks(int n)
Sets the maximum number of chunks extracted by a confidence-based chunker for evaluation.

Parameters:
n - Number of chunks to extract with confidence.

lastConfidenceCaseReport

public String lastConfidenceCaseReport()
Returns a string-based representation of the last evaluation case's confidence evaluation. If there has not been an evaluation case or the chunker being evaluated is not a confidence-based chunker, this result will be null.

Returns:
A string representation of the last case's confidence evaluation.

setMaxNBest

public void setMaxNBest(int n)
Sets the maximum number of chunkings extracted by an n-best chunker for evaluation.

Parameters:
n - Number of chunkings to evaluate in n-best chunking.

setMaxNBestReport

public void setMaxNBestReport(int n)
Sets the maximum number of chunkings that will be reported in a case report, that is, the chunkings reported through a call to the lastNBestCaseReport() method.

Parameters:
n - Number of n-best results to print in a case report.

lastNBestCaseReport

public String lastNBestCaseReport()
Returns a string-based representation of the last n-best evaluation case.

Returns:
String representing the last n-best case evaluation.

handle

@Deprecated
public void handle(String[] tokens,
                              String[] whitespaces,
                              String[] tags)
Deprecated. Wrap with BioTagChunkCodec instead.

Handle the specified reference chunking encoded in the standard BIO tag chunking format. If the whitespaces are null, a single space character is used to separate tokens.

See handle(Chunking) for more information.

Specified by:
handle in interface TagHandler
Parameters:
tokens - Array of tokens.
whitespaces - Array of whitespaces.
tags - Array of tags.

handle

public void handle(Chunking referenceChunking)
Handle the specified reference chunking. This involves running the chunker being evaluated over the reference chunking's sequence to create a response chunking, which is then added with the reference chunking as a case to the chunking evaluation.

If the contained chunker returns null for a given input, this method will fill in a chunking over the appropriate sequence with no chunks for evaluation.

Specified by:
handle in interface ObjectHandler<Chunking>
Parameters:
referenceChunking - The reference chunking case.
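
The handling steps described above (run the chunker over the reference text, substitute an empty response when the chunker returns null, then score the response against the reference) can be sketched as follows. SketchChunker, SketchEvaluator and SketchEvaluatorDemo are hypothetical stand-ins, not LingPipe types; chunks are encoded as "start-end:type" strings purely for illustration, and the true positive, false positive and false negative counts are an assumption about what a running score might track.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a chunker; it may return null for an input.
interface SketchChunker {
    Set<String> chunk(CharSequence text);
}

// Hypothetical stand-in for an evaluator that keeps running counts.
class SketchEvaluator {
    private final SketchChunker chunker;
    private int truePositives;
    private int falsePositives;
    private int falseNegatives;

    SketchEvaluator(SketchChunker chunker) {
        this.chunker = chunker;
    }

    // Runs the chunker on the reference text and adds the case to the counts.
    void handle(CharSequence text, Set<String> referenceChunks) {
        Set<String> response = chunker.chunk(text);
        // A null response is treated as a chunking with no chunks.
        if (response == null) response = Collections.emptySet();
        for (String chunk : response) {
            if (referenceChunks.contains(chunk)) truePositives++;
            else falsePositives++;
        }
        for (String chunk : referenceChunks) {
            if (!response.contains(chunk)) falseNegatives++;
        }
    }

    int truePositives() { return truePositives; }
    int falsePositives() { return falsePositives; }
    int falseNegatives() { return falseNegatives; }
}

class SketchEvaluatorDemo {
    public static void main(String[] args) {
        // Toy chunker: finds one person chunk if the text starts with "John", else returns null.
        SketchChunker chunker = text ->
            text.toString().startsWith("John") ? Collections.singleton("0-4:PER") : null;
        SketchEvaluator evaluator = new SketchEvaluator(chunker);

        evaluator.handle("John lives in Paris",
                         new HashSet<>(Arrays.asList("0-4:PER", "14-19:LOC")));
        evaluator.handle("No names here", new HashSet<>());

        System.out.println("TP=" + evaluator.truePositives()
                           + " FP=" + evaluator.falsePositives()
                           + " FN=" + evaluator.falseNegatives());
    }
}
```

In this toy run the first case contributes one matched chunk and one missed chunk, and the second case exercises the null-response path without changing any count.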

confidenceEvaluation

public ScoredPrecisionRecallEvaluation confidenceEvaluation()
Returns the scored precision-recall evaluation derived from a confidence-based chunker. If the chunker being evaluated is not a confidence-based chunker, then this evaluation will be empty.

This is the actual evaluation used by this class, so changing it will affect this class's results.

Returns:
The scored precision/recall evaluation.

evaluation

public ChunkingEvaluation evaluation()
Returns the first-best chunking evaluation.

This is the actual evaluation used by this class, so changing it will affect this class's results.

Returns:
The chunking evaluation.

nBestEvaluation

public ObjectToCounterMap<Integer> nBestEvaluation()
Returns the n-best evaluation in the form of a mapping from ranks to the number of times the reference chunking was that rank in the evaluation. The ranks are instances of Integer, with -1 being the rank assigned to cases in which the reference chunking was not among the n-best results.

This is the actual counter used by this class, so changing it will affect this class's results.

If the chunker being evaluated is not an n-best chunker, then this evaluation will be empty.

Returns:
The n-best evaluation.
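
The rank-to-count mapping described above can be illustrated with a plain java.util.Map. This sketch makes several assumptions: RankCounterSketch and RankCounterDemo are hypothetical classes, chunkings are represented as plain strings, and ranks are taken to be zero-based list positions. Only the use of -1 for a reference chunking that is missing from the n-best results comes from the description above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a rank counter (not LingPipe's ObjectToCounterMap).
class RankCounterSketch {
    private final Map<Integer, Integer> rankCounts = new HashMap<>();

    // Records the rank of the reference within the n-best list, or -1 if it is absent.
    void addCase(String reference, List<String> nBest) {
        int rank = nBest.indexOf(reference); // indexOf returns -1 when not found
        rankCounts.merge(rank, 1, Integer::sum);
    }

    // Number of cases in which the reference had the given rank.
    int count(int rank) {
        return rankCounts.getOrDefault(rank, 0);
    }
}

class RankCounterDemo {
    public static void main(String[] args) {
        RankCounterSketch counter = new RankCounterSketch();
        counter.addCase("A", Arrays.asList("A", "B", "C")); // reference is first
        counter.addCase("C", Arrays.asList("A", "B", "C")); // reference is third
        counter.addCase("Z", Arrays.asList("A", "B"));      // reference is not in the list
        counter.addCase("A", Arrays.asList("A"));           // reference is first

        System.out.println("rank 0: " + counter.count(0));   // prints "rank 0: 2"
        System.out.println("rank 2: " + counter.count(2));   // prints "rank 2: 1"
        System.out.println("rank -1: " + counter.count(-1)); // prints "rank -1: 1"
    }
}
```

Reading the counts this way shows how often the chunker's best answer was the reference, and how often the reference fell outside the n-best cutoff entirely.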

toString

public String toString()
Returns a string-based representation of this evaluation. It will include the first-best evaluation. An n-best evaluation and/or a confidence evaluation are included if defined.

Overrides:
toString in class Object
Returns:
A string-based representation of this evaluator.