com.aliasi.hmm
Class TagWordLattice

java.lang.Object
  extended by com.aliasi.tag.TagLattice<String>
      extended by com.aliasi.hmm.TagWordLattice

Deprecated. Use TagLattice interface for return results, and ForwardBackwardTagLattice for construction.

@Deprecated
public class TagWordLattice
extends TagLattice<String>

A TagWordLattice encodes a lattice resulting from decoding a hidden Markov model (HMM). The lattice encodes the tokens used as input and the tag symbol table, as well as matrices for transition, forward and backward scores.

The lattice probabilities are factored into start, transition, and end probabilities. In general, the start, transition and forward probabilities include the emission probabilities of their destination tag. The backward probabilities include all emissions up to, but not including, the indexed node.
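
The factoring described above can be illustrated with a small stand-alone sketch. It is not LingPipe code: the class name ForwardBackwardSketch, the method names, and the [tokenIndex][sourceTag][targetTag] layout of the transition array are assumptions made for illustration. Start and transition values are assumed to already include the emission of their destination tag, and end values to exclude emissions.

```java
/** Hypothetical stand-alone sketch of the factoring; not part of LingPipe. */
final class ForwardBackwardSketch {

    /** forward[i][k]: start, transitions and emissions up to and including token i, ending in tag k. */
    static double[][] forward(double[] start, double[][][] transit) {
        int numTokens = transit.length;   // assumed layout: transit[tokenIndex][source][target]
        int numTags = start.length;
        double[][] fwd = new double[numTokens][numTags];
        if (numTokens == 0) return fwd;
        fwd[0] = start.clone();           // start already includes the first emission
        for (int i = 1; i < numTokens; ++i)
            for (int k = 0; k < numTags; ++k)
                for (int j = 0; j < numTags; ++j)
                    fwd[i][k] += fwd[i - 1][j] * transit[i][j][k];
        return fwd;
    }

    /** backward[i][j]: end probability and emissions after, but not including, token i. */
    static double[][] backward(double[] end, double[][][] transit) {
        int numTokens = transit.length;
        int numTags = end.length;
        double[][] bk = new double[numTokens][numTags];
        if (numTokens == 0) return bk;
        bk[numTokens - 1] = end.clone();
        for (int i = numTokens - 2; i >= 0; --i)
            for (int j = 0; j < numTags; ++j)
                for (int k = 0; k < numTags; ++k)
                    bk[i][j] += transit[i + 1][j][k] * bk[i + 1][k];
        return bk;
    }

    /** Total probability of all paths; 1.0 for an empty lattice. */
    static double total(double[][] fwd, double[] end) {
        if (fwd.length == 0) return 1.0;
        double sum = 0.0;
        int last = fwd.length - 1;
        for (int k = 0; k < end.length; ++k)
            sum += fwd[last][k] * end[k];
        return sum;
    }
}
```

In this sketch the entry transit[0] is never read, because a transition is indexed by the token it arrives at. Summing forward[i][k] * backward[i][k] over the tags k at any token index i gives the same value as total.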

Since:
LingPipe2.1
Version:
3.9
Author:
Bob Carpenter

Constructor Summary
TagWordLattice(String[] tokens, SymbolTable tagSymbolTable, double[] startProbs, double[] endProbs, double[][][] transitProbs)
          Deprecated. Construct a tag-word lattice for the specified token inputs and the specified tag symbol table with the specified estimates.
 
Method Summary
 double backward(int tokenIndex, int tagId)
          Deprecated. Returns the backward probability up to the token of the specified index and for the tag of the specified identifier.
 String[] bestForwardBackward()
          Deprecated. Returns the array of tags with the best forward-backward probabilities for each token position.
 double end(int tagId)
          Deprecated. Return the probability of the lattice ending with the specified tag.
 double forward(int tokenIndex, int tagId)
          Deprecated. Returns the forward probability up to the token of the specified index and for the tag of the specified identifier.
 double forwardBackward(int tokenIndex, int tagId)
          Deprecated. Returns the product of the forward and backward probabilities for the token with the specified index and tag with the specified identifier.
 double log2Backward(int tokenIndex, int tagId)
          Deprecated. Returns the log (base 2) backward probability up to the token of the specified index and for the tag of the specified identifier.
 List<ScoredObject<String>> log2ConditionalTagList(int tokenIndex)
          Deprecated. Returns a list of tag-score pairs for the specified token index as scored objects in order of descending score.
 ScoredObject<String>[] log2ConditionalTags(int tokenIndex)
          Deprecated. Use log2ConditionalTagList(int) instead.
 double log2End(int tagId)
          Deprecated. Return the log (base 2) probability of the lattice ending with the specified tag.
 double log2Forward(int tokenIndex, int tagId)
          Deprecated. Returns the log (base 2) of the forward probability up to the token of the specified index and for the tag of the specified identifier.
 double log2ForwardBackward(int tokenIndex, int tagId)
          Deprecated. Returns the log (base 2) of the product of the forward and backward probabilities for the token with the specified index and tag with the specified identifier.
 double log2Start(int tagId)
          Deprecated. Return the log (base 2) probability of the lattice starting with the tag with the specified identifier and emitting the first input token.
 double log2Total()
          Deprecated. Returns the log (base 2) total probability for all paths in the lattice.
 double log2Transitions(int tokenIndex, int sourceTagId, int targetTagId)
          Deprecated. Returns the log (base 2) transition probability for the specified token index and source and target tag identifiers.
 double logBackward(int token, int tag)
          Deprecated. Returns the log of the backward probability to the specified token and tag.
 double logForward(int token, int tag)
          Deprecated. Return the log of the forward probability of the specified tag at the specified position.
 double logProbability(int tokenIndex, int tagId)
          Deprecated. Convenience method returning the log of the conditional probability that the specified token has the specified tag, given the complete list of input tokens.
 double logProbability(int tokenFrom, int[] tags)
          Deprecated. Return the log conditional probability that the tokens starting with the specified token position have the specified tags given the complete sequence of input tokens.
 double logProbability(int tokenTo, int tagFrom, int tagTo)
          Deprecated. Convenience method returning the log of the conditional probability that the two specified tokens have the specified tags, given the complete list of input tokens.
 double logTransition(int tokenFrom, int tagFrom, int tagTo)
          Deprecated. Returns the log of the transition probability from the specified input token position with the specified previous tag to the specified target tag.
 double logZ()
          Deprecated. Return the log of the normalizing constant for the lattice.
 int numTags()
          Deprecated. Return the number of tags in this tag lattice.
 int numTokens()
          Deprecated. Returns the length of this tag lattice as measured by number of tokens.
 double start(int tagId)
          Deprecated. Return the probability of the lattice starting with the tag with the specified identifier and emitting the first input token.
 String tag(int n)
          Deprecated. Return the tag with the specified symbol identifier.
 List<String> tagList()
          Deprecated. Returns an unmodifiable view of the list of tags used in this lattice, indexed by identifier.
 SymbolTable tagSymbolTable()
          Deprecated. Return the symbol table for tags in this tag-word lattice.
 String token(int n)
          Deprecated. Return the token at the specified position in the input.
 List<String> tokenList()
          Deprecated. Return an unmodifiable view of the underlying tokens for this tag lattice.
 String[] tokens()
          Deprecated. Returns the array of tokens underlying this tag-word lattice.
 double total()
          Deprecated. Returns the total probability for all paths in the lattice.
 double transition(int tokenIndex, int sourceTagId, int targetTagId)
          Deprecated. Returns the transition probability for the specified token index and source and target tag identifiers.
 
Methods inherited from class com.aliasi.tag.TagLattice
tokenClassification
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TagWordLattice

public TagWordLattice(String[] tokens,
                      SymbolTable tagSymbolTable,
                      double[] startProbs,
                      double[] endProbs,
                      double[][][] transitProbs)
Deprecated. 
Construct a tag-word lattice for the specified token inputs and the specified tag symbol table with the specified estimates. This constructor also allocates the forward and backward arrays which are the size of the number of tokens times the number of tags.

There are a number of consistency conditions on the input:

Parameters:
tokens - Array of input tokens.
tagSymbolTable - Symbol table for tags.
startProbs - Array of start probabilities.
endProbs - Array of end probabilities.
transitProbs - Array of transition probabilities.
Throws:
IllegalArgumentException - If any of the probabilities are not between 0.0 and 1.0 inclusive.
Method Detail

tokens

public String[] tokens()
Deprecated. 
Returns the array of tokens underlying this tag-word lattice.

Returns:
The array of tokens for this lattice.

tagSymbolTable

public SymbolTable tagSymbolTable()
Deprecated. 
Return the symbol table for tags in this tag-word lattice.

Specified by:
tagSymbolTable in class TagLattice<String>
Returns:
The symbol table for the lattice.

log2ConditionalTagList

public List<ScoredObject<String>> log2ConditionalTagList(int tokenIndex)
Deprecated. 
Returns a list of tag-score pairs for the specified token index as scored objects in order of descending score. The scores are log (base 2) conditional probabilities of the tag being assigned to the specified token given the token sequence.

Parameters:
tokenIndex - Token index whose tags are returned.
Returns:
Scored tags for the specified index.
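
As an illustration of the ordering and scoring described here, the following stand-alone sketch turns the forward-backward values at one token index into log (base 2) conditional probabilities sorted in descending order. It is not the LingPipe implementation: the class and method names are hypothetical, Map.Entry stands in for ScoredObject, and the forward-backward values are assumed to be supplied by the caller.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; Map.Entry is used in place of LingPipe's ScoredObject. */
final class ConditionalTagListSketch {

    /**
     * @param tags            tag names indexed by tag identifier
     * @param forwardBackward forward-backward values for one token index, by tag identifier
     * @return tag/score pairs, scores being log2 conditional probabilities, best first
     */
    static List<Map.Entry<String, Double>> log2ConditionalTags(
            String[] tags, double[] forwardBackward) {
        // The total probability equals the sum of the forward-backward values.
        double total = 0.0;
        for (double fb : forwardBackward) total += fb;

        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (int k = 0; k < tags.length; ++k) {
            double log2Cond =
                (Math.log(forwardBackward[k]) - Math.log(total)) / Math.log(2.0);
            scored.add(new SimpleImmutableEntry<>(tags[k], log2Cond));
        }
        // Descending order of score.
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return scored;
    }
}
```

A tag with forward-backward value 0.0 receives a score of negative infinity and sorts last. If every value is 0.0, the scores are NaN, which this sketch does not handle.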

log2ConditionalTags

@Deprecated
public ScoredObject<String>[] log2ConditionalTags(int tokenIndex)
Deprecated. Use log2ConditionalTagList(int) instead.

Returns the array of tag-score pairs for the specified token index as scored objects in order of descending score. The scores are log (base 2) conditional probabilities of the tag being assigned to the specified token given the token sequence.

Parameters:
tokenIndex - Token index whose tags are returned.
Returns:
Array of scored tags for the specified index.

bestForwardBackward

public String[] bestForwardBackward()
Deprecated. 
Returns the array of tags with the best forward-backward probabilities for each token position.

Note: This is the independent optimization of each position and is not guaranteed to yield the sequence of states that has the highest probability.

Returns:
Array of tags with the best forward-backward scores.
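
The note above can be made concrete with a small stand-alone sketch that compares the per-position choice with the single best sequence (found here with the Viterbi algorithm). This is not LingPipe code: the class and method names are hypothetical, and for brevity the sketch assumes one transition matrix shared by all positions with emission factors already folded into the start and transition values.

```java
/** Hypothetical sketch contrasting per-position decoding with best-sequence decoding. */
final class PosteriorVsViterbiSketch {

    /** Picks, independently at each position, the tag with the largest forward*backward value. */
    static int[] bestPerPosition(double[] start, double[] end, double[][] trans, int numTokens) {
        int t = start.length;
        double[][] fwd = new double[numTokens][t];
        double[][] bk = new double[numTokens][t];
        fwd[0] = start.clone();
        for (int i = 1; i < numTokens; ++i)
            for (int k = 0; k < t; ++k)
                for (int j = 0; j < t; ++j)
                    fwd[i][k] += fwd[i - 1][j] * trans[j][k];
        bk[numTokens - 1] = end.clone();
        for (int i = numTokens - 2; i >= 0; --i)
            for (int j = 0; j < t; ++j)
                for (int k = 0; k < t; ++k)
                    bk[i][j] += trans[j][k] * bk[i + 1][k];
        int[] best = new int[numTokens];
        for (int i = 0; i < numTokens; ++i)
            for (int k = 1; k < t; ++k)
                if (fwd[i][k] * bk[i][k] > fwd[i][best[i]] * bk[i][best[i]])
                    best[i] = k;
        return best;
    }

    /** Finds the single most probable tag sequence (Viterbi). */
    static int[] bestSequence(double[] start, double[] end, double[][] trans, int numTokens) {
        int t = start.length;
        double[][] score = new double[numTokens][t];
        int[][] back = new int[numTokens][t];
        score[0] = start.clone();
        for (int i = 1; i < numTokens; ++i)
            for (int k = 0; k < t; ++k)
                for (int j = 0; j < t; ++j) {
                    double s = score[i - 1][j] * trans[j][k];
                    if (j == 0 || s > score[i][k]) {
                        score[i][k] = s;
                        back[i][k] = j;
                    }
                }
        int last = 0;
        for (int k = 1; k < t; ++k)
            if (score[numTokens - 1][k] * end[k] > score[numTokens - 1][last] * end[last])
                last = k;
        int[] path = new int[numTokens];
        path[numTokens - 1] = last;
        for (int i = numTokens - 1; i > 0; --i)
            path[i - 1] = back[i][path[i]];
        return path;
    }
}
```

With two tags, two tokens, start values {0.4, 0.6}, end values {1.0, 1.0} and transitions {{0.0, 1.0}, {0.5, 0.5}}, the per-position choice is tag 1 at both positions, a sequence with probability 0.3, whereas the best sequence is tag 0 followed by tag 1, with probability 0.4. Both methods assume at least one token.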

start

public double start(int tagId)
Deprecated. 
Return the probability of the lattice starting with the tag with the specified identifier and emitting the first input token.

Parameters:
tagId - Identifier for the tag in the symbol table.
Returns:
Start probability.
Throws:
IndexOutOfBoundsException - If the tagId is out of bounds.

log2Start

public double log2Start(int tagId)
Deprecated. 
Return the log (base 2) probability of the lattice starting with the tag with the specified identifier and emitting the first input token. See start(int) for more information.

Parameters:
tagId - Identifier for the tag in the symbol table.
Returns:
Log start probability.
Throws:
IndexOutOfBoundsException - If the tagId is out of bounds.

end

public double end(int tagId)
Deprecated. 
Return the probability of the lattice ending with the specified tag. Note that this does not include the probability of emitting the final token.

Parameters:
tagId - Identifier for the tag in the symbol table.
Returns:
End probability.
Throws:
IndexOutOfBoundsException - If the tag identifier is out of bounds.

log2End

public double log2End(int tagId)
Deprecated. 
Return the log (base 2) probability of the lattice ending with the specified tag. See end(int) for more information.

Parameters:
tagId - Identifier for the tag in the symbol table.
Returns:
Log end probability.
Throws:
IndexOutOfBoundsException - If the tag identifier is out of bounds.

transition

public double transition(int tokenIndex,
                         int sourceTagId,
                         int targetTagId)
Deprecated. 
Returns the transition probability for the specified token index and source and target tag identifiers. This transition probability includes the transition from the source tag to the target tag times the probability of the target tag emitting the token at the specified index.

Note that the token index cannot be zero here, as it is the index of the target of a transition.

Parameters:
tokenIndex - Index of token.
sourceTagId - Identifier for source tag in symbol table.
targetTagId - Identifier for target tag in symbol table.
Returns:
Transition score from source tag to target tag arriving at the specified token index.
Throws:
IndexOutOfBoundsException - If the token index or either tag identifier is out of bounds.

log2Transitions

public double log2Transitions(int tokenIndex,
                              int sourceTagId,
                              int targetTagId)
Deprecated. 
Returns the log (base 2) transition probability for the specified token index and source and target tag identifiers. See transition(int,int,int) for more information.

Parameters:
tokenIndex - Index of token.
sourceTagId - Identifier for source tag in symbol table.
targetTagId - Identifier for target tag in symbol table.
Returns:
Log transition probability from source tag to target tag arriving at the specified token index.
Throws:
IndexOutOfBoundsException - If the token index or either tag identifier is out of bounds.

forward

public double forward(int tokenIndex,
                      int tagId)
Deprecated. 
Returns the forward probability up to the token of the specified index and for the tag of the specified identifier. The forward estimate includes the start probabilities and the emissions up to and including the token at the specified index.

Parameters:
tokenIndex - Index of token.
tagId - Identifier of tag in symbol table.
Returns:
Forward probability for the token and tag.
Throws:
IndexOutOfBoundsException - If the token index or the tag identifier is out of bounds.

log2Forward

public double log2Forward(int tokenIndex,
                          int tagId)
Deprecated. 
Returns the log (base 2) of the forward probability up to the token of the specified index and for the tag of the specified identifier. See forward(int,int) for more information.

Parameters:
tokenIndex - Index of token.
tagId - Identifier of tag in symbol table.
Returns:
Log forward probability for the token index and tag.
Throws:
IndexOutOfBoundsException - If the token index or the tag identifier is out of bounds.

backward

public double backward(int tokenIndex,
                       int tagId)
Deprecated. 
Returns the backward probability up to the token of the specified index and for the tag of the specified identifier. This includes the stop probability and emissions up to but not including the specified token index.

Parameters:
tokenIndex - Index of token.
tagId - Identifier of tag in symbol table.
Returns:
Backward probability for the token and tag.
Throws:
IndexOutOfBoundsException - If the token index or the tag identifier is out of bounds.

log2Backward

public double log2Backward(int tokenIndex,
                           int tagId)
Deprecated. 
Returns the log (base 2) backward probability up to the token of the specified index and for the tag of the specified identifier. See backward(int,int) for more information.

Parameters:
tokenIndex - Index of token.
tagId - Identifier of tag in symbol table.
Returns:
Log backward probability for the token and tag.
Throws:
IndexOutOfBoundsException - If the token index or the tag identifier is out of bounds.

forwardBackward

public double forwardBackward(int tokenIndex,
                              int tagId)
Deprecated. 
Returns the product of the forward and backward probabilities for the token with the specified index and tag with the specified identifier. Dividing this result by the total probability as given by total() results in the normalized state probability between 0.0 and 1.0. Furthermore, the sum of all forward-backward probabilities at any given token index is equal to the total lattice probability.

Parameters:
tokenIndex - Index of token.
tagId - Identifier of tag in symbol table.
Returns:
Forward-backward probability for the token and tag.
Throws:
IndexOutOfBoundsException - If the token index or the tag identifier is out of bounds.
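
The normalization described for this method can be sketched as follows. This is a stand-alone illustration, not LingPipe code: the class and method names are hypothetical, and the forward and backward values for a single token index are assumed to be passed in as arrays indexed by tag identifier.

```java
/** Hypothetical sketch of normalizing forward-backward values at one token index. */
final class PosteriorSketch {

    /**
     * @param forwardAtToken  forward values at one token index, by tag identifier
     * @param backwardAtToken backward values at the same token index, by tag identifier
     * @return conditional probability of each tag at that token given the whole input
     */
    static double[] posterior(double[] forwardAtToken, double[] backwardAtToken) {
        int numTags = forwardAtToken.length;
        double[] result = new double[numTags];
        double total = 0.0;
        for (int k = 0; k < numTags; ++k) {
            result[k] = forwardAtToken[k] * backwardAtToken[k];
            total += result[k];   // the sum at any token index equals the lattice total
        }
        for (int k = 0; k < numTags; ++k)
            result[k] /= total;   // NaN if total is 0.0, i.e. no path has positive probability
        return result;
    }
}
```

For forward values {0.4, 0.6} and backward values {0.25, 0.375}, the forward-backward products are 0.1 and 0.225, the total is 0.325, and the normalized values sum to 1.0.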

log2ForwardBackward

public double log2ForwardBackward(int tokenIndex,
                                  int tagId)
Deprecated. 
Returns the log (base 2) of the product of the forward and backward probabilities for the token with the specified index and tag with the specified identifier. See forwardBackward(int,int) for more information.

Parameters:
tokenIndex - Index of token.
tagId - Identifier of tag in symbol table.
Returns:
Log forward-backward probability for the token and tag.
Throws:
IndexOutOfBoundsException - If the token index or the tag identifier is out of bounds.

total

public double total()
Deprecated. 
Returns the total probability for all paths in the lattice. This probability is the marginal probability of the input tokens. This probability will be equal to the sum of the forward-backward probabilities at any token index. If there are no tokens in the lattice, the total probability is 1.0, and the log probability is 0.0.

The conditional probability of a state at a given token position given the entire input is equal to the forward-backward probability divided by the total probability. The forward-backward probability is the joint probability of the input tokens and state, whereas the total probability is the probability of the input tokens.

Warning: This value is likely to underflow for long inputs; in this case use log2Total() instead.

Returns:
Total probability for the lattice.
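
The underflow mentioned in the warning, and the usual way of avoiding it on the log scale, can be illustrated with a stand-alone sketch. It is not how LingPipe is known to compute log2Total(); the class and method names are hypothetical. The sketch sums probabilities that are given as log (base 2) values by factoring out the largest one before exponentiating.

```java
/** Hypothetical sketch of summing probabilities supplied as log (base 2) values. */
final class Log2SumSketch {

    /** Returns log2 of the sum of 2^v over all v in log2Values. */
    static double log2Sum(double[] log2Values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : log2Values)
            max = Math.max(max, v);
        if (max == Double.NEGATIVE_INFINITY)
            return max;                        // empty input or all probabilities zero
        double sum = 0.0;
        for (double v : log2Values)
            sum += Math.pow(2.0, v - max);     // each term is in (0, 1], so no underflow to a zero sum
        return max + Math.log(sum) / Math.log(2.0);
    }
}
```

For example, two values of -2000 each represent a probability of 2^-2000, which is 0.0 as a double, yet their log (base 2) sum is still representable as -1999.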

log2Total

public double log2Total()
Deprecated. 
Returns the log (base 2) total probability for all paths in the lattice. See total() for more information.

Returns:
Log total probability for the lattice.

logForward

public double logForward(int token,
                         int tag)
Deprecated. 
Description copied from class: TagLattice
Return the log of the forward probability of the specified tag at the specified position. The forward probability is the sum of the joint probabilities of all sequences from the initial token to the specified token ending with the specified tag.

Specified by:
logForward in class TagLattice<String>
Parameters:
token - Token position.
tag - Tag identifier.
Returns:
Log forward probability that the specified token has the specified tag.

logBackward

public double logBackward(int token,
                          int tag)
Deprecated. 
Description copied from class: TagLattice
Returns the log of the backward probability to the specified token and tag. The backward probability is the sum of the joint probabilities of all sequences starting from the specified token and specified tag and going to the end of the list of tokens.

Specified by:
logBackward in class TagLattice<String>
Parameters:
token - Input token position.
tag - Tag identifier.
Returns:
Log backward probability for the specified token and tag.

logZ

public double logZ()
Deprecated. 
Description copied from class: TagLattice
Return the log of the normalizing constant for the lattice. Its value is the log of the marginal probability of the input tokens. By the additive law of probability, this marginal probability is equal to the sum of the probabilities of all possible analyses for the input sequence.

Specified by:
logZ in class TagLattice<String>
Returns:
The log of the normalizing constant.

logTransition

public double logTransition(int tokenFrom,
                            int tagFrom,
                            int tagTo)
Deprecated. 
Description copied from class: TagLattice
Returns the log of the transition probability from the specified input token position with the specified previous tag to the specified target tag.

Specified by:
logTransition in class TagLattice<String>
Parameters:
tokenFrom - Token position from which the transition is made.
tagFrom - Identifier for the previous tag from which the transition is made.
tagTo - Tag identifier for the target tag to which the transition is made.
Returns:
Log probability of the transition.

logProbability

public double logProbability(int tokenIndex,
                             int tagId)
Deprecated. 
Description copied from class: TagLattice
Convenience method returning the log of the conditional probability that the specified token has the specified tag, given the complete list of input tokens.

This method returns results defined by

 logProbability(n,tag)
     == logProbability(n,new int[] { tag })

Specified by:
logProbability in class TagLattice<String>
Parameters:
tokenIndex - Position of input token.
tagId - Identifier of tag.
Returns:
The log probability the token has the tag.

logProbability

public double logProbability(int tokenTo,
                             int tagFrom,
                             int tagTo)
Deprecated. 
Description copied from class: TagLattice
Convenience method returning the log of the conditional probability that the two specified tokens have the specified tags, given the complete list of input tokens.

This method returns results defined by

 logProbability(tokenTo,tagFrom,tagTo)
     == logProbability(tokenTo-1,new int[] { tagFrom, tagTo })

Specified by:
logProbability in class TagLattice<String>
Parameters:
tokenTo - Position of second token.
tagFrom - First Tag from which transition is made.
tagTo - Second Tag to which transition is made.
Returns:
Log probability of the tags at the specified position.
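
One standard way to obtain such a two-tag conditional probability from lattice quantities is to combine the forward value of the first tag, the transition between the tags, and the backward value of the second tag, then normalize. The stand-alone sketch below illustrates this; it is not confirmed to be the LingPipe implementation, the class and method names are hypothetical, and natural logarithms are an assumption.

```java
/** Hypothetical sketch of log conditional probabilities for adjacent tag pairs. */
final class PairPosteriorSketch {

    /**
     * @param forwardPrev  forward values at the first token, by tag identifier
     * @param transit      transit[a][b]: transition from tag a to tag b arriving at the
     *                     second token, including the emission of that token by tag b
     * @param backwardNext backward values at the second token, by tag identifier
     * @return logPost[a][b]: natural log of the probability that the first token has
     *         tag a and the second has tag b, given the whole input
     */
    static double[][] logPairPosterior(double[] forwardPrev, double[][] transit,
                                       double[] backwardNext) {
        int t = forwardPrev.length;
        double[][] joint = new double[t][t];
        double total = 0.0;
        for (int a = 0; a < t; ++a)
            for (int b = 0; b < t; ++b) {
                joint[a][b] = forwardPrev[a] * transit[a][b] * backwardNext[b];
                total += joint[a][b];          // the joints sum to the lattice total
            }
        double[][] logPost = new double[t][t];
        for (int a = 0; a < t; ++a)
            for (int b = 0; b < t; ++b)
                logPost[a][b] = Math.log(joint[a][b]) - Math.log(total);
        return logPost;
    }
}
```

An impossible pair has joint probability 0.0 and therefore a log probability of negative infinity. Exponentiating and summing all entries of the result gives 1.0.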

logProbability

public double logProbability(int tokenFrom,
                             int[] tags)
Deprecated. 
Description copied from class: TagLattice
Return the log conditional probability that the tokens starting with the specified token position have the specified tags given the complete sequence of input tokens.

Specified by:
logProbability in class TagLattice<String>
Parameters:
tokenFrom - Starting position of sequence.
tags - Tag identifiers for sequence.
Returns:
Log probability that sequence starting at the specified position has the specified tags.

numTokens

public int numTokens()
Deprecated. 
Description copied from class: TagLattice
Returns the length of this tag lattice as measured by number of tokens.

Specified by:
numTokens in class TagLattice<String>
Returns:
Number of tokens in this lattice.

tokenList

public List<String> tokenList()
Deprecated. 
Description copied from class: TagLattice
Return an unmodifiable view of the underlying tokens for this tag lattice.

Specified by:
tokenList in class TagLattice<String>
Returns:
The tokens for this lattice.

token

public String token(int n)
Deprecated. 
Description copied from class: TagLattice
Return the token at the specified position in the input.

Specified by:
token in class TagLattice<String>
Parameters:
n - Input position.
Returns:
Token at position.

numTags

public int numTags()
Deprecated. 
Description copied from class: TagLattice
Return the number of tags in this tag lattice.

Specified by:
numTags in class TagLattice<String>
Returns:
Number of tags for this tag lattice.

tag

public String tag(int n)
Deprecated. 
Description copied from class: TagLattice
Return the tag with the specified symbol identifier.

Specified by:
tag in class TagLattice<String>
Parameters:
n - Identifier for tag.
Returns:
Tag with specified ID.

tagList

public List<String> tagList()
Deprecated. 
Description copied from class: TagLattice
Returns an unmodifiable view of the list of tags used in this lattice, indexed by identifier.

Specified by:
tagList in class TagLattice<String>
Returns:
The list of tags for this lattice.