java.lang.Object
  com.aliasi.lm.UniformProcessLM
public class UniformProcessLM
A UniformProcessLM implements a uniform language model
with a specified number of outcomes and the same
probability assigned to the end-of-stream marker. The formula
for computing sequence likelihood estimates is:
log2Estimate(cSeq)
= (cSeq.length()+1) * log2 ( 1 / (numOutcomes+1) )
Adding one to the number of outcomes makes the end-of-sequence
marker just as likely as any other character. Adding one to the
sequence length adds the log likelihood of the end-of-sequence
marker itself.
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from interface com.aliasi.lm.LanguageModel: LanguageModel.Conditional, LanguageModel.Dynamic, LanguageModel.Process, LanguageModel.Sequence, LanguageModel.Tokenized |

| Constructor Summary | |
|---|---|
| UniformProcessLM() | Construct a uniform process language model with a number of outcomes equal to the total number of characters. |
| UniformProcessLM(double crossEntropyRate) | Construct a uniform process language model with the specified character cross-entropy rate. |
| UniformProcessLM(int numOutcomes) | Construct a uniform process language model with the specified number of outcomes. |

| Method Summary | | |
|---|---|---|
| void | compileTo(ObjectOutput objOut) | Writes a compiled version of this model to the specified object output. |
| void | handle(CharSequence cs) | This method for training a character sequence is supplied for compatibility with the dynamic language model interface, but is implemented to do nothing. |
| double | log2Estimate(char[] cs, int start, int end) | Returns an estimate of the log (base 2) probability of the specified character slice. |
| double | log2Estimate(CharSequence cSeq) | Returns an estimate of the log (base 2) probability of the specified character sequence. |
| int | numOutcomes() | Returns the number of outcomes for this uniform model. |
| void | train(char[] cs, int start, int end) | Ignores the training data. |
| void | train(char[] cs, int start, int end, int count) | Ignores the training data. |
| void | train(CharSequence cSeq) | Ignores the training data. |
| void | train(CharSequence cSeq, int count) | Ignores the training data. |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Constructor Detail |
|---|
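
public UniformProcessLM()
Construct a uniform process language model with a number of outcomes
equal to the total number of characters.

public UniformProcessLM(int numOutcomes)
Construct a uniform process language model with the specified number
of outcomes. The estimate for each character is 1/numOutcomes.

Parameters:
numOutcomes - The number of outcomes for this language model.

public UniformProcessLM(double crossEntropyRate)
Construct a uniform process language model with the specified
character cross-entropy rate. The log (base 2) probability of a
character sequence cs is:
log2 P(cs)
= - crossEntropyRate * cs.length()
The number of outcomes is set by raising two to the
cross-entropy rate and rounding down:
numOutcomes = (int) 2.0^crossEntropyRate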
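
The rule above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not LingPipe's implementation: the class `CrossEntropySketch` and its helper methods are hypothetical names, and the per-character probability is assumed to be 1/numOutcomes with no end-of-stream term. Under these assumptions, the rounding step means that log2 P(cs) equals - crossEntropyRate * cs.length() exactly only when 2.0^crossEntropyRate is an integer.

```java
// Hypothetical sketch; none of these names belong to LingPipe.
final class CrossEntropySketch {

    // Base-2 logarithm helper (assumed for illustration).
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Mirrors the documented rule: numOutcomes = (int) 2.0^crossEntropyRate.
    // The int cast truncates, which rounds down for positive values.
    static int numOutcomesFor(double crossEntropyRate) {
        return (int) Math.pow(2.0, crossEntropyRate);
    }

    // Assumed estimate: each of the length characters has
    // probability 1/numOutcomes, so the log2 probabilities add up.
    static double log2Estimate(int numOutcomes, int length) {
        return -length * log2(numOutcomes);
    }

    public static void main(String[] args) {
        System.out.println(numOutcomesFor(8.0)); // prints 256
        System.out.println(numOutcomesFor(2.5)); // prints 5 (2^2.5 is about 5.66)

        // Effective per-character rate after rounding down to 5 outcomes;
        // it is smaller than the requested 2.5 bits.
        System.out.println(log2(numOutcomesFor(2.5)));

        // Five characters at 8 bits each.
        System.out.println(log2Estimate(numOutcomesFor(8.0), 5));
    }
}
```

A rate of 8.0 gives exactly 256 outcomes, so a five-character sequence costs about 40 bits. A rate of 2.5 is rounded down to 5 outcomes, so the effective rate in this sketch is log2(5), a little over 2.3 bits per character.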

Parameters:
crossEntropyRate - Character cross-entropy rate of the uniform model.

| Method Detail |
|---|
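
public int numOutcomes()
Returns the number of outcomes for this uniform model.

public void compileTo(ObjectOutput objOut)
               throws IOException
Writes a compiled version of this model to the specified object
output. The object read back in is an instance of UniformProcessLM.

Specified by:
compileTo in interface Compilable
Parameters:
objOut - Object output to which this model is written.
Throws:
IOException - If there is an I/O error during the write.

public void handle(CharSequence cs)
This method for training a character sequence is supplied for
compatibility with the dynamic language model interface, but is
implemented to do nothing.

Specified by:
handle in interface ObjectHandler<CharSequence>
Parameters:
cs - Ignored.

public void train(char[] cs,
                  int start,
                  int end)
Ignores the training data.

Specified by:
train in interface LanguageModel.Dynamic
Parameters:
cs - Ignored.
start - Ignored.
end - Ignored.

public void train(char[] cs,
                  int start,
                  int end,
                  int count)
Ignores the training data.

Specified by:
train in interface LanguageModel.Dynamic
Parameters:
cs - Ignored.
start - Ignored.
end - Ignored.
count - Ignored.

public void train(CharSequence cSeq)
Ignores the training data.

Specified by:
train in interface LanguageModel.Dynamic
Parameters:
cSeq - Ignored.

public void train(CharSequence cSeq,
                  int count)
Ignores the training data.

Specified by:
train in interface LanguageModel.Dynamic
Parameters:
cSeq - Ignored.
count - Ignored.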
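
public double log2Estimate(char[] cs,
                           int start,
                           int end)
Description copied from interface: LanguageModel
Returns an estimate of the log (base 2) probability of the
specified character slice.

Specified by:
log2Estimate in interface LanguageModel
Parameters:
cs - Underlying array of characters.
start - Index of first character in slice.
end - One plus index of last character in slice.

public double log2Estimate(CharSequence cSeq)
Description copied from interface: LanguageModel
Returns an estimate of the log (base 2) probability of the
specified character sequence.

Specified by:
log2Estimate in interface LanguageModel
Parameters:
cSeq - Character sequence to estimate.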
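
To make the slice parameters concrete, here is a minimal sketch of how the two log2Estimate overloads could relate in a uniform model. It is not the LingPipe source: `UniformSliceSketch` is a hypothetical class, the per-character probability is assumed to be 1/numOutcomes with no end-of-stream term, and the bounds check is an assumption about reasonable behavior. The point it illustrates is that `end` is exclusive, so a slice covers `end - start` characters, and that under these assumptions the estimate depends only on length, never on the characters themselves.

```java
// Hypothetical sketch; not part of LingPipe.
final class UniformSliceSketch {

    // log2(1/numOutcomes), computed once.
    private final double log2PerChar;

    UniformSliceSketch(int numOutcomes) {
        if (numOutcomes < 1) {
            throw new IllegalArgumentException("numOutcomes must be positive");
        }
        this.log2PerChar = -Math.log(numOutcomes) / Math.log(2.0);
    }

    // Estimate for the slice cs[start..end-1]; end is one past the last index.
    double log2Estimate(char[] cs, int start, int end) {
        if (start < 0 || end > cs.length || start > end) {
            throw new IndexOutOfBoundsException(
                "start=" + start + " end=" + end + " length=" + cs.length);
        }
        return (end - start) * log2PerChar;
    }

    // Estimate for a whole character sequence.
    double log2Estimate(CharSequence cSeq) {
        return cSeq.length() * log2PerChar;
    }

    public static void main(String[] args) {
        UniformSliceSketch lm = new UniformSliceSketch(256);
        char[] cs = "hello world".toCharArray();

        // Both calls cover five characters, so they print the same value.
        System.out.println(lm.log2Estimate(cs, 0, 5));
        System.out.println(lm.log2Estimate("hello"));
    }
}
```

With 256 outcomes each character costs about 8 bits in this sketch, so any five-character slice or sequence receives the same estimate of roughly -40.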