|
|||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | ||||||||
java.lang.Objectcom.aliasi.hmm.HmmDecoder
public class HmmDecoder
An HmmDecoder provides implementations of first-best,
n-best and marginal taggers for hidden Markov models (HMMs). A
decoder is constructed from a hidden Markov model.
Tagger, which
specifies a first-best tagging method tag(List).
This method provides the likely (first best) path of HMM states
(tags) given a sequence of string emissions (outputs). First-best
decoding is implemented using Viterbi's dynamic programming
algorithm.
HMM decoders also implement the interface NBestTagger,
which specifies the method tagNBest(List,int) and tagNBestConditional(List,int). These methods both return an
iterator over scored taggings. N-best decoding is implemented
using the Viterbi algorithm in a forward pass and the A*
algorithm in the backward pass using the Viterbi estimates as exact
completion estimates. The variant conditional method further
normalizes outputs to posterior conditional probabilities, and is
a bit more expensive to compute.
HMM decoders also implement the MarginalTagger
interface, which specifies a method tagMarginal(List) for
providing marginal probability estimates for categories for a token
given the input string. Marginal decoding is implemented using the
standard forward-backward algorithm. The lattice is an instance of
TagWordLattice, which itself implements TagLattice;
see that class's documentation for information on how to retrieve
cumulative (total) probabilities for input token sequences and
posterior conditional probabilities (confidence scores) per token.
The decoder is able to cache emission probabilities, the
computation of which is often the bottleneck in decoding. Caches
may be provided in the constructor for both ordinary (linear)
probabilities and log probabilities of emissions. A cache is
simply an instance of Map from strings to arrays of doubles.
The first-best and n-best decoders only uses log probabilities, whereas the n-best normalized and lattice decoders use linear probabilities. Only the probabilities computed are cached, so if a program only does first-best processing, only linear estimates are cached.
Any implementation of Map may be used as a cache,
but particular attention must be paid to thread safety and
scalability. A fully synchronized cache can be created with:
Map<String,double[]> cache
= java.util.Collections.synchronizedMap(new HashMap<String,double[]>());
LingPipe's map implementation FastCache is
designed specifically to be used as a cache in settings such as
these.
It is often (e.g. on English newsire) easy to get high token coverage (e.g. 97%) with a rather modestly sized cache (e.g. 100K tokens). Other corpora and languages may vary and we encourage experimentation with efficiency versus memory for caching. Note that run times will speed up as more and more estimates are returned from the cache rather than being computed directly.
This class does not perform any underlying sychronization. If
the hidden Markov model is not thread safe, then it must be
synchronized. Similarly for the caches. Note that FastCache, while not synchronized, is thread safe.
Similarly, the compilation of an HMM trained with HmmCharLmEstimator is thread safe, in fact allowing safe
concurrent access because it is immutable.
For first-best and n-best decoding, a beam may be used to prune
unlikely hypotheses. This beam is set during construction or
through the method setLog2EmissionBeam(double) (setting
and access must be concurrent read/exclusive write synchronized
from the caller). The beam works token by token. As each token is
considered, any tag whose emission log (base 2) likelihood is more
than the beam less than the bes5t emission estimate is eliminated
from further consideration.
| Constructor Summary | |
|---|---|
HmmDecoder(HiddenMarkovModel hmm)
Construct an HMM decoder from the specified HMM. |
|
HmmDecoder(HiddenMarkovModel hmm,
Map<String,double[]> emissionCache,
Map<String,double[]> emissionLog2Cache)
Construct an HMM decoder from the specified HMM using the specified caches for linear and log probabilities. |
|
HmmDecoder(HiddenMarkovModel hmm,
Map<String,double[]> emissionCache,
Map<String,double[]> emissionLog2Cache,
double log2Beam,
double log2EmissionBeam)
Construct an HMM decoder from the specified HMM using the specified caches for linear and log probabilities, with the specified beam width for emission estimates. |
|
| Method Summary | |
|---|---|
Map<String,double[]> |
emissionCache()
Returns the mapping used to cache emission probabilities, or null if not caching. |
Map<String,double[]> |
emissionLog2Cache()
Returns the mapping used to cache log (base 2) emission probabilities, or null if not caching. |
String[] |
firstBest(String[] emissions)
Deprecated. Use tag(List) instead. |
HiddenMarkovModel |
getHmm()
Returns the hidden Markov model underlying this decoder. |
TagWordLattice |
lattice(String[] emissions)
Deprecated. Use tagMarginal(List) instead. |
Iterator<ScoredObject<String[]>> |
nBest(String[] emissions)
Deprecated. Use tagNBest(List,int) instead. |
Iterator<ScoredObject<String[]>> |
nBest(String[] emissions,
int maxN)
Deprecated. Use tagNBest(List,int) instead. |
Iterator<ScoredObject<String[]>> |
nBestConditional(String[] emissions)
Deprecated. Use tagNBestConditional(List,int) instead. |
void |
setEmissionCache(Map<String,double[]> cache)
Sets the emission cache to the specified value. |
void |
setEmissionLog2Cache(Map<String,double[]> cache)
Sets the log emission cache to the specified value. |
void |
setLog2Beam(double log2Beam)
Sets the value of the log2 beam to the specified value. |
void |
setLog2EmissionBeam(double log2EmissionBeam)
Sets the log (base 2) emission beam width. |
Tagging<String> |
tag(List<String> tokens)
Return the tagging for the specified list of tokens. |
TagLattice<String> |
tagMarginal(List<String> tokens)
Return the marginal tagging for the specified list of input tokens. |
Iterator<ScoredTagging<String>> |
tagNBest(List<String> tokens,
int maxResults)
Return an iterator over the n-best scored taggings for
the specified input tokens up to a specified maximum n. |
Iterator<ScoredTagging<String>> |
tagNBestConditional(List<String> tokens,
int maxResults)
Return an iterator over the n-best scored taggings for
the specified input tokens up to a specified maximum n,
with scores normalized to conditional probabilities. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public HmmDecoder(HiddenMarkovModel hmm)
hmm - Model to use as basis of decoding.
public HmmDecoder(HiddenMarkovModel hmm,
Map<String,double[]> emissionCache,
Map<String,double[]> emissionLog2Cache)
null, in which case the
corresponding values will not be cached.
hmm - Model to use for decoding.emissionCache - Map to use for emission caching.emissionLog2Cache - Map to use for log emission caching.
public HmmDecoder(HiddenMarkovModel hmm,
Map<String,double[]> emissionCache,
Map<String,double[]> emissionLog2Cache,
double log2Beam,
double log2EmissionBeam)
null, in which case the
corresponding values will not be cached.
hmm - Model to use for decoding.emissionCache - Map to use for emission caching.emissionLog2Cache - Map to use for log emission caching.log2Beam - The log (base 2) beam for pruning full hypotheses.log2EmissionBeam - The log (base 2) beam for pruning emission hypotheses.
IllegalArgumentException - If either beam is not a
non-negative number.| Method Detail |
|---|
public HiddenMarkovModel getHmm()
public Map<String,double[]> emissionCache()
null if not caching. This is the actual mapping,
so changes to it will affect this decoder.
public Map<String,double[]> emissionLog2Cache()
null if not caching. This is
the actual mapping, so changes to it will affect this decoder.
public void setEmissionCache(Map<String,double[]> cache)
Warning: This method should not be executed concurrently with any calls to decoding, as it may produce an inconsistent result. The typical application will be to set a cache before using a decoder.
cache - Cache for linear emission estimates.public void setLog2EmissionBeam(double log2EmissionBeam)
log2EmissionBeam - Width of beam.
IllegalArgumentException - If the beam width is negative.public void setLog2Beam(double log2Beam)
log2Beam - The log (base 2) Viterbi beam.
IllegalArgumentException - If the beam width is negative.public void setEmissionLog2Cache(Map<String,double[]> cache)
Warning: This method should not be executed concurrently with any calls to decoding, as it may produce an inconsistent result. The typical application will be to set a cache before using a decoder.
cache - Cache for linear emission estimates.@Deprecated public TagWordLattice lattice(String[] emissions)
tagMarginal(List) instead.
Implementation Note: This method is implemented by
the standard forward-backward dynamic programming algorithm.
The estimates
Pα(n,state) are of
the probability of a derivation starting at the beginning and
arriving at the specified state after the specified number of
tokens.
Pα(0,state)
= Pstart(state)
* Pemit(emissions[0],state)
Pα(n,state)
= ΣsourceState
Pα(n-1,sourceState)
* Ptransit(state|sourceState)
* Pemit(emissions[n]|state)
Note that the forward probabilities up to a token position
include the emission probability for that token.
The backward values are defined dually as the probability of a derivation ending at a specified state being continued to the end of the analysis.
Pβ(|emissions|-1,state)
= Pend(state)
Pβ(n,state)
= ΣtargetState
Pβ(n+1,targetState)
* Ptransit(targetState|state)
* Pemit(emissions[n+1]|targetState)
Note that token emission probabilities for a given state are not
included in the backward score; they are computed with target
states in a way that matches the forward algorithm's
definition. This asymmetry is so that the forward-backward
estimates Pγ(n,state)
correspond to the probability of a derivation being in a given
state for a given position given the input:
Pγ(n,state)
= Pα(n,state)
* Pβ(n,state)
These values include the token emission probabilities,
but may be normalized in the usual Bayesian fashion by
dividing by the marginal P(emissions):
P(n,state|emissions)
= Pγ(n,state) / P(emissions)
where the marginal is just the sum of all forward-backward
values for any token position i:
P(emissions)
= Σstate
Pγ(i,state)
Most of the computation is carried out by the constructor
TagWordLattice.TagWordLattice(String[],SymbolTable,double[],double[],double[][][]) with the following start, end and transition arrays:
start(state) = Pstart(0,state)
* Pemit(emissions[0],state)
end(state) = Pend(state)
transition(i,sourceState,targetState)
= Ptransit(targetState|sourceState)
* Pemit(emissions[i]|targetState)
emissions - Array of strings emitted.
@Deprecated public String[] firstBest(String[] emissions)
tag(List) instead.
Implementation Note: This method is implemented with
the Viterbi algorithm. The Viterbi algorithm uses dynamic
programming (memoization) to compute the maximum probability of
arriving in a state state after consuming
inputs emissions[0],...,emissions[n-1]:
Pbest(0,state)
= Pstart(state)
* Pemit(emissions[0],state)
Pbest(n,state)
= MAXprevState
Pbest(n-1,prevState)
* Ptransit(state|prevState)
* Pemit(emissions[n]|state)
Pbest(last,state)
*= Pend(state)
Note that the initial condition uses the start probability
rather than the transition times the previous best probability.
The notation in the last line is meant to indicate that
the last index has the probability of a state being the
last state multiplied in. As usual, we use logarithms
and additions rather than multiplication.
As usual, the algorithm employs an array of backpointers from a state at a given input to the last state along the best path. This is computed by simply recording the state maximizing the above equation:
backPtr(0,state) = null
backPtr(n,state)
= ARGMAXprevState
Pbest(n-1,prevState)
* Ptransit(state|prevState)
* Pemit(emissions[n]|state)
By tracing the array of backpointers from the best final
state, the best path can be recovered.
emissions - Array of strings emitted.
@Deprecated public Iterator<ScoredObject<String[]>> nBest(String[] emissions)
tagNBest(List,int) instead.
ScoredObject instances
consisting of arrays of tags and log (base 2) joint likelihoods
of the tags and emissions with respect to the underlying HMM.
Only analyses with non-zero probability estimates are returned.
Implementation Note: This method is implemented by
doing a Viterbi search to provide exact A* bounds
for a backwards n-best pass using the A* algorithm.
Thus it will be slower than just computing the first best
result using firstBest(String[]). The iterator stores
the entire Viterbi lattice as well as a priority queue of
partial states ordered by the A* condition.
emissions - String outputs whose tag sequences are returned.
@Deprecated
public Iterator<ScoredObject<String[]>> nBest(String[] emissions,
int maxN)
tagNBest(List,int) instead.
ScoredObject instances
consisting of arrays of tags and log (base 2) joint likelihoods
of the tags and emissions with respect to the underlying HMM up
to the specified maximum number of results.
Implementation Note: This method is implemented by
doing a Viterbi search to provide exact A* bounds
for a backwards n-best pass using the A* algorithm.
Thus it will be slower than just computing the first best
result using firstBest(String[]). The iterator stores
the entire Viterbi lattice as well as a priority queue of
partial states ordered by the A* condition.
emissions - String outputs whose tag sequences are returned.
@Deprecated public Iterator<ScoredObject<String[]>> nBestConditional(String[] emissions)
tagNBestConditional(List,int) instead.
Conditional estimates of tags given emissions are derived from dividing the joint estimates by the marginal likelihood of the emissions as computed by summing over all joint estimates.
Implementation Note: The total log likelihood is
returned by applying TagWordLattice.log2Total() to the
result of decoding the input with lattice(String[]).
The joint estimates are iterated using the iterator returned by
nBest(String[]) and then modified by subtracting the
emission marginal log likelihood from the joint emission/tags
log likelihood. This method adds the cost of the full lattice
computation to the joint n-best method. The space for the full
lattice is used transiently when this method is called and
may be garbage-collected even before the first element is returned
by the iterator.
emissions - String outputs whose tag sequences are returned.
public Tagging<String> tag(List<String> tokens)
Tagger
tag in interface Tagger<String>tokens - Input tokens to tag.
public Iterator<ScoredTagging<String>> tagNBest(List<String> tokens,
int maxResults)
NBestTaggern-best scored taggings for
the specified input tokens up to a specified maximum n.
tagNBest in interface NBestTagger<String>tokens - Input tokens to tag.maxResults - Maximum number of results to return.
public Iterator<ScoredTagging<String>> tagNBestConditional(List<String> tokens,
int maxResults)
NBestTaggern-best scored taggings for
the specified input tokens up to a specified maximum n,
with scores normalized to conditional probabilities.
Optional operation.
tagNBestConditional in interface NBestTagger<String>tokens - Input tokens to tag.maxResults - Maximum number of results to return.
public TagLattice<String> tagMarginal(List<String> tokens)
MarginalTagger
tagMarginal in interface MarginalTagger<String>tokens - Input tokens to tag.
|
|||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | ||||||||