com.aliasi.hmm
Class HmmEvaluation

java.lang.Object
  extended by com.aliasi.hmm.HmmEvaluation

Deprecated. Use TaggerEvaluator, MarginalTaggerEvaluator and NBestTaggerEvaluator instead.

@Deprecated
public class HmmEvaluation
extends Object

An HmmEvaluation stores and reports the results of evaluating hidden Markov models. It provides methods for adding test cases (with results) and for reporting those results in several forms.

The top-level addCase(String[],String[],String[],TagWordLattice,Iterator) adds a complete case in the form of tokens, reference tags, first-best response tags, a tag-word lattice of confidence estimates for tags, and an iterator over the n-best list. All of these are available as outputs from an HmmDecoder. If this method is used for every case, all reports will be complete; otherwise, some reports will omit cases. For instance, if addFirstBestCase(String[],String[],String[]) is called directly, it adds results only to the first-best evaluation, so only the first-best evaluation results will cover that case.

Results are available in the form of two different classifier evaluations. The method firstBestEvaluation() returns the evaluation of the first-best results as a first-best classifier evaluation on a token-by-token basis. The method confidenceEvaluation() returns the confidence-based evaluation in the form of a joint probability classifier evaluation.

The results of the n-best decoder are available as a histogram through nBestHistogram(). This histogram maps ranks to the number of cases for which the correct result was of that rank in the n-best list. For instance, if the reference tagging was returned at rank 7 by the n-best iterator on three occasions, then the n-best histogram maps the Integer 7 to the count 3.

The method caseAccuracy() returns the fraction of cases for which the first-best answer was completely correct. This is most meaningful when the cases are coherent units, such as sentences.

First-best accuracy for unknown tokens is available through the method unknownTokenAccuracy(). The set of known tokens is available through the method knownTokenSet(). This set is empty after construction. Tokens may be added to it through the method addKnownToken(String).

Since:
LingPipe 2.1
Version:
3.9
Author:
Bob Carpenter

Constructor Summary
HmmEvaluation(String[] tags, int maxNBest)
          Deprecated. Construct a hidden Markov model evaluation with the specified depth of n-best evaluation.
 
Method Summary
 void addCase(String[] tokens, String[] referenceTags, String[] responseTags, TagWordLattice lattice, Iterator<ScoredObject<String[]>> nBestIterator)
          Deprecated. See class documentation.
 void addFirstBestCase(String[] tokens, String[] referenceTags, String[] responseTags)
          Deprecated. Adds a first-best response case with the specified tokens, reference tags, and first-best response tags.
 void addKnownToken(String token)
          Deprecated. Adds the specified token to the set of known tokens.
 void addLatticeCase(String[] tokens, String[] referenceTags, TagWordLattice lattice)
          Deprecated. See class documentation.
 void addNBestCase(String[] tokens, String[] referenceTags, Iterator<ScoredObject<String[]>> nBestIterator)
          Deprecated. Adds an n-best response case with the specified tokens, reference tags and n-best iterator.
 double caseAccuracy()
          Deprecated. Returns the accuracy measured over entire cases.
 ClassifierEvaluator<String,JointClassification> confidenceEvaluation()
          Deprecated. See class documentation.
 ClassifierEvaluator<String,Classification> firstBestEvaluation()
          Deprecated. See class documentation.
 Set<String> knownTokenSet()
          Deprecated. Returns the set of known tokens for this evaluation.
 int maxNBest()
          Deprecated. Returns the maximum n-best result searched.
 ObjectToCounterMap<Integer> nBestHistogram()
          Deprecated. Returns the histogram of ranks of the reference tagging in the n-best responses.
 long numCases()
          Deprecated. Returns the number of cases making up this evaluation.
 long numTokens()
          Deprecated. Returns the number of tokens making up this evaluation.
 String toString()
          Deprecated. Returns a terse, one-line report of the current state of this evaluation.
 double unknownTokenAccuracy()
          Deprecated. Returns the first-best accuracy for unknown tokens.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

HmmEvaluation

public HmmEvaluation(String[] tags,
                     int maxNBest)
Deprecated. 
Constructs a hidden Markov model evaluation with the specified depth of n-best evaluation. The n-best evaluation depth determines how many entries of the n-best results are searched before giving up. Large values of maxNBest may slow processing significantly, especially for long inputs.

Parameters:
tags - Possible state tags output by the HMM.
maxNBest - Maximum n-best output to consider.
Method Detail

numCases

public long numCases()
Deprecated. 
Returns the number of cases making up this evaluation.


numTokens

public long numTokens()
Deprecated. 
Returns the number of tokens making up this evaluation.


maxNBest

public int maxNBest()
Deprecated. 
Returns the maximum n-best result searched.


firstBestEvaluation

@Deprecated
public ClassifierEvaluator<String,Classification> firstBestEvaluation()
Deprecated. See class documentation.

Returns the classifier evaluation derived from the first-best hypotheses. This is a first-best classifier evaluation.

Returns:
This evaluation's classifier evaluation.

confidenceEvaluation

@Deprecated
public ClassifierEvaluator<String,JointClassification> confidenceEvaluation()
Deprecated. See class documentation.

Returns the classifier evaluation derived from the tag-word lattice confidence scoring. The result is an evaluation with scores and meaningful ranked outputs such as precision-recall curves.

Returns:
The confidence evaluation for this HMM.

nBestHistogram

public ObjectToCounterMap<Integer> nBestHistogram()
Deprecated. 
Returns the histogram of ranks of the reference tagging in the n-best responses. The mapping is from Integer objects representing ranks to the number of cases in which the result at that rank matched the reference tagging. Ranks found within the search are greater than or equal to zero and less than the value of maxNBest(). In addition, the count assigned to maxNBest() itself is the number of cases whose rank is greater than or equal to maxNBest(), that is, cases in which the reference tagging was not found among the results searched.

Returns:
The n-best histogram for this evaluation.
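
The rank bookkeeping described above can be illustrated with a small standalone sketch. It does not use LingPipe classes and is not the library's implementation: the class NBestHistogramSketch and the methods rankOf and record are hypothetical names, a plain Map stands in for ObjectToCounterMap, and the n-best list is simplified to an iterator over tag arrays. It assumes zero-based ranks and assigns the rank maxNBest to a reference tagging that is not found within the search depth.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the n-best rank histogram; not LingPipe code.
class NBestHistogramSketch {

    // Returns the zero-based rank of the reference tagging in the n-best
    // list, or maxNBest if it is not among the first maxNBest entries.
    static int rankOf(String[] referenceTags, Iterator<String[]> nBest, int maxNBest) {
        for (int rank = 0; rank < maxNBest && nBest.hasNext(); ++rank) {
            if (Arrays.equals(referenceTags, nBest.next())) {
                return rank;
            }
        }
        return maxNBest;
    }

    // Increments the histogram count for the given rank.
    static void record(Map<Integer, Integer> histogram, int rank) {
        histogram.merge(rank, 1, Integer::sum);
    }

    public static void main(String[] args) {
        int maxNBest = 3;
        Map<Integer, Integer> histogram = new TreeMap<>();
        List<String[]> nBest = Arrays.asList(
            new String[] {"DET", "VERB"},
            new String[] {"DET", "NOUN"},
            new String[] {"NOUN", "NOUN"});

        // The reference tagging is the second entry, so its rank is 1.
        record(histogram, rankOf(new String[] {"DET", "NOUN"}, nBest.iterator(), maxNBest));
        // This reference tagging is absent, so it is counted under maxNBest.
        record(histogram, rankOf(new String[] {"VERB", "VERB"}, nBest.iterator(), maxNBest));

        System.out.println(histogram); // prints {1=1, 3=1}
    }
}
```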

addCase

@Deprecated
public void addCase(String[] tokens,
                               String[] referenceTags,
                               String[] responseTags,
                               TagWordLattice lattice,
                               Iterator<ScoredObject<String[]>> nBestIterator)
Deprecated. See class documentation.

Adds a complete response case for evaluation, consisting of the specified tokens, reference tags, first-best response tags, lattice of forward-backward confidence-based scores, and an iterator over the n-best list. Any of the last three arguments that is null is skipped and not added to the corresponding evaluation.

Parameters:
tokens - The tokens for the evaluation.
referenceTags - The reference tagging.
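responseTags - The first-best response tagging; may be null.
lattice - The tag-word lattice of confidence scores; may be null.
nBestIterator - The iterator over the n-best list; may be null.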
Throws:
IllegalArgumentException - If the token and tag arrays are not the same length, or if the lattice is not over the specified token array.

addFirstBestCase

public void addFirstBestCase(String[] tokens,
                             String[] referenceTags,
                             String[] responseTags)
Deprecated. 
Adds a first-best response case with the specified tokens, reference tags, and first-best response tags. Note that this only adds information to the first-best evaluation, not the n-best or lattice-based evaluations.

Parameters:
tokens - The tokens for the evaluation.
referenceTags - The reference tagging.
responseTags - The response tagging.
Throws:
IllegalArgumentException - If the token, reference tag and response tag arrays are not all the same length.

caseAccuracy

public double caseAccuracy()
Deprecated. 
Returns the accuracy measured over entire cases. This is the number of evaluation cases that are completely correct divided by the number of cases evaluated. The figure is most meaningful when each case corresponds to a coherent unit such as a sentence.

Returns:
The first-best complete case tagging accuracy.
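
As a rough illustration of the ratio described above, the following standalone sketch counts a case as correct only when every response tag matches its reference tag. The class CaseAccuracySketch is a hypothetical name and does not reproduce LingPipe's code; in particular, returning NaN when no cases have been added is a property of this sketch, not a documented behavior of the library.

```java
import java.util.Arrays;

// Hypothetical stand-in for whole-case accuracy; not LingPipe code.
class CaseAccuracySketch {
    private long numCases = 0;
    private long numCorrectCases = 0;

    // Records one case; it is correct only if all tags match.
    void addFirstBestCase(String[] referenceTags, String[] responseTags) {
        if (referenceTags.length != responseTags.length) {
            throw new IllegalArgumentException("Tag arrays must be the same length.");
        }
        ++numCases;
        if (Arrays.equals(referenceTags, responseTags)) {
            ++numCorrectCases;
        }
    }

    // Completely correct cases divided by all cases (NaN if there are none).
    double caseAccuracy() {
        return (double) numCorrectCases / numCases;
    }

    public static void main(String[] args) {
        CaseAccuracySketch eval = new CaseAccuracySketch();
        // Fully correct case.
        eval.addFirstBestCase(new String[] {"DET", "NOUN"}, new String[] {"DET", "NOUN"});
        // One wrong tag makes the whole case incorrect.
        eval.addFirstBestCase(new String[] {"DET", "NOUN"}, new String[] {"DET", "VERB"});
        System.out.println(eval.caseAccuracy()); // prints 0.5
    }
}
```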

knownTokenSet

public Set<String> knownTokenSet()
Deprecated. 
Returns the set of known tokens for this evaluation. The returned set cannot be modified by the caller, but it reflects the current set of known tokens, including tokens added later through addKnownToken(String).

Returns:
The set of known tokens for this evaluation.
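
The description above says callers cannot modify the returned set, yet it tracks the current known tokens. One common way to get that behavior in Java is an unmodifiable view over a private backing set, sketched below. This is an assumption about how such a contract could be met, not a statement of how LingPipe implements it; the class KnownTokenViewSketch is a hypothetical name.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a read-only, live view of known tokens.
class KnownTokenViewSketch {
    private final Set<String> knownTokens = new HashSet<>();
    // Read-only wrapper; it reads through to knownTokens on every call.
    private final Set<String> knownTokensView = Collections.unmodifiableSet(knownTokens);

    Set<String> knownTokenSet() {
        return knownTokensView;
    }

    void addKnownToken(String token) {
        knownTokens.add(token);
    }

    public static void main(String[] args) {
        KnownTokenViewSketch eval = new KnownTokenViewSketch();
        Set<String> view = eval.knownTokenSet();
        eval.addKnownToken("the");
        // The view obtained earlier sees the later addition.
        System.out.println(view.contains("the")); // prints true
        try {
            view.add("a");
        } catch (UnsupportedOperationException e) {
            // Direct modification through the view is rejected.
            System.out.println("read-only view");
        }
    }
}
```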

addKnownToken

public void addKnownToken(String token)
Deprecated. 
Adds the specified token to the set of known tokens.

Parameters:
token - Token to add to set of known tokens.

unknownTokenAccuracy

public double unknownTokenAccuracy()
Deprecated. 
Returns the first-best accuracy for unknown tokens. Unknown tokens are those not in the known token set (see knownTokenSet()) at the time the evaluation case was added; tokens added to the set afterwards do not change how earlier cases were counted.

Returns:
The first-best unknown token accuracy.
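
The sketch below illustrates the point that a token's known or unknown status is decided when its case is added. It is a standalone approximation with hypothetical names (UnknownTokenAccuracySketch and its fields), not LingPipe's implementation, and it assumes accuracy is the number of correctly tagged unknown tokens divided by the number of unknown tokens seen.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for unknown-token accuracy; not LingPipe code.
class UnknownTokenAccuracySketch {
    private final Set<String> knownTokens = new HashSet<>();
    private long unknownTotal = 0;
    private long unknownCorrect = 0;

    void addKnownToken(String token) {
        knownTokens.add(token);
    }

    // Counts only tokens that are unknown at the moment the case is added.
    void addFirstBestCase(String[] tokens, String[] referenceTags, String[] responseTags) {
        if (tokens.length != referenceTags.length || tokens.length != responseTags.length) {
            throw new IllegalArgumentException("Arrays must all be the same length.");
        }
        for (int i = 0; i < tokens.length; ++i) {
            if (!knownTokens.contains(tokens[i])) {
                ++unknownTotal;
                if (referenceTags[i].equals(responseTags[i])) {
                    ++unknownCorrect;
                }
            }
        }
    }

    // Correct unknown tokens divided by all unknown tokens (NaN if none).
    double unknownTokenAccuracy() {
        return (double) unknownCorrect / unknownTotal;
    }

    public static void main(String[] args) {
        UnknownTokenAccuracySketch eval = new UnknownTokenAccuracySketch();
        eval.addKnownToken("the");
        // "zorp" and "runs" are unknown; only "zorp" is tagged correctly.
        eval.addFirstBestCase(
            new String[] {"the", "zorp", "runs"},
            new String[] {"DET", "NOUN", "VERB"},
            new String[] {"DET", "NOUN", "NOUN"});
        // Marking "runs" as known now does not change the earlier counts.
        eval.addKnownToken("runs");
        System.out.println(eval.unknownTokenAccuracy()); // prints 0.5
    }
}
```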

addLatticeCase

@Deprecated
public void addLatticeCase(String[] tokens,
                                      String[] referenceTags,
                                      TagWordLattice lattice)
Deprecated. See class documentation.

Adds a lattice-based response case with the specified tokens, reference tags and lattice. Note that this only adds information to the lattice evaluation, not the first-best or n-best evaluations.

Parameters:
tokens - The tokens for the evaluation.
referenceTags - The reference tagging.
lattice - The response lattice.
Throws:
IllegalArgumentException - If the token and reference tag arrays are different lengths, or if the lattice tokens are not the same as the tokens.

addNBestCase

public void addNBestCase(String[] tokens,
                         String[] referenceTags,
                         Iterator<ScoredObject<String[]>> nBestIterator)
Deprecated. 
Adds an n-best response case with the specified tokens, reference tags and n-best iterator. Note that this only adds information to the n-best evaluation, not the first-best or confidence-based lattice evaluations.

Parameters:
tokens - The tokens for the evaluation.
referenceTags - The reference tagging.
nBestIterator - The n-best iterator.
Throws:
IllegalArgumentException - If the token and reference tag arrays are different lengths.

toString

public String toString()
Deprecated. 
Returns a terse, one-line report of the current state of this evaluation.

Overrides:
toString in class Object
Returns:
A string representation of the state of this evaluation.