public interface LanguageModel
A LanguageModel provides an estimate of the probability of a
sequence of characters. Sequences of characters may be specified
via an array slice or with a Java CharSequence, which is an
interface implemented by String, StringBuffer and
the new I/O buffer class CharBuffer.
There are several subinterfaces of language model. The primary
distinction is between LanguageModel.Sequence
and LanguageModel.Process, which place different normalization
requirements on their estimates. Sequence models require the
estimates to sum to 1.0 over all character sequences, whereas a
process model requires that, for each length, the estimates sum to
1.0 over all sequences of that length. Every language model should
be marked by one of these two subinterfaces.
The LanguageModel.Conditional interface provides additional methods
for conditional estimates. The LanguageModel.Dynamic interface provides
a method for training the model with sample character sequence
data. Finally, several of the language model implementations are
serializable to an object output stream.
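The difference between the two normalization requirements can be checked numerically on a toy model. The sketch below is only an illustration: `ToyLanguageModel`, `PROCESS`, `SEQUENCE`, the two-symbol alphabet and the stop probability are all hypothetical names and values introduced here, not part of this library. The process-style model spreads probability uniformly over sequences of each fixed length, while the sequence-style model additionally pays for a stop decision so that the total over all lengths approaches 1.0.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of process vs. sequence normalization; not library code. */
final class NormalizationSketch {

    /** Hypothetical stand-in mirroring the CharSequence method documented here. */
    interface ToyLanguageModel {
        double log2Estimate(CharSequence cs);
    }

    /** Assumed two-symbol alphabet, kept small so sequences can be enumerated. */
    static final char[] ALPHABET = {'a', 'b'};

    /** Assumed probability of stopping after each position (sequence model only). */
    static final double STOP = 0.5;

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Process-style: uniform over all sequences of each fixed length. */
    static final ToyLanguageModel PROCESS =
        cs -> -cs.length() * log2(ALPHABET.length);

    /** Sequence-style: each character also pays (1 - STOP), and the end pays STOP. */
    static final ToyLanguageModel SEQUENCE =
        cs -> cs.length() * log2((1.0 - STOP) / ALPHABET.length) + log2(STOP);

    /** Enumerates every string of exactly n characters over ALPHABET. */
    static List<String> allOfLength(int n) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (int i = 0; i < n; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : out) {
                for (char c : ALPHABET) {
                    next.add(prefix + c);
                }
            }
            out = next;
        }
        return out;
    }

    /** Sums 2^log2Estimate over all sequences of exactly length n. */
    static double massAtLength(ToyLanguageModel lm, int n) {
        double sum = 0.0;
        for (String s : allOfLength(n)) {
            sum += Math.pow(2.0, lm.log2Estimate(s));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Process normalization: the mass at each individual length is about 1.0.
        for (int n = 0; n <= 3; n++) {
            System.out.println("process mass at length " + n + ": "
                + massAtLength(PROCESS, n));
        }
        // Sequence normalization: the mass summed across lengths approaches 1.0.
        double total = 0.0;
        for (int n = 0; n <= 12; n++) {
            total += massAtLength(SEQUENCE, n);
        }
        System.out.println("sequence mass up to length 12: " + total);
    }
}
```

Summing the process-style model across several lengths would exceed 1.0, which is why the two kinds of model are marked by separate subinterfaces rather than treated as interchangeable.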
| Nested Class Summary | |
|---|---|
| static interface | LanguageModel.Conditional: A LanguageModel.Conditional is a language model that implements conditional estimates of characters given previous characters. |
| static interface | LanguageModel.Dynamic: A LanguageModel.Dynamic accepts training events in the form of character slices or sequences. |
| static interface | LanguageModel.Process: A LanguageModel.Process is normalized by length. |
| static interface | LanguageModel.Sequence: A LanguageModel.Sequence is normalized over all character sequences. |
| static interface | LanguageModel.Tokenized: A LanguageModel.Tokenized provides a means of estimating the probability of a sequence of tokens. |
| Method Summary | |
|---|---|
| double | log2Estimate(char[] cs, int start, int end): Returns an estimate of the log (base 2) probability of the specified character slice. |
| double | log2Estimate(CharSequence cs): Returns an estimate of the log (base 2) probability of the specified character sequence. |
| Method Detail |
|---|
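double log2Estimate(char[] cs, int start, int end)
Returns an estimate of the log (base 2) probability of the specified character slice.
Parameters:
cs - Underlying array of characters.
start - Index of first character in slice.
end - One plus index of last character in slice.
Throws:
IndexOutOfBoundsException - If start or end minus one falls outside the bounds of the character array.

double log2Estimate(CharSequence cs)
Returns an estimate of the log (base 2) probability of the specified character sequence.
Parameters:
cs - Character sequence to estimate.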
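As a rough sketch of how the two overloads relate, the example below calls both on the same text. To stay self-contained it uses a hypothetical local interface, `ToyLanguageModel`, that mirrors the two signatures documented above, and a toy `FixedCostModel` that charges an assumed fixed number of bits per character. Neither is part of this library, and a real implementation may check bounds differently.

```java
/** Sketch of calling the two log2Estimate overloads; the model is a toy stand-in. */
final class Log2EstimateUsage {

    /** Hypothetical local mirror of the two methods documented above. */
    interface ToyLanguageModel {
        double log2Estimate(char[] cs, int start, int end);
        double log2Estimate(CharSequence cs);
    }

    /** Toy model: every character independently costs a fixed number of bits. */
    static final class FixedCostModel implements ToyLanguageModel {
        private final double bitsPerChar;

        FixedCostModel(double bitsPerChar) {
            this.bitsPerChar = bitsPerChar;
        }

        @Override
        public double log2Estimate(char[] cs, int start, int end) {
            // Bounds check in the spirit of the documented IndexOutOfBoundsException.
            if (start < 0 || end > cs.length || start > end) {
                throw new IndexOutOfBoundsException(
                    "start=" + start + " end=" + end + " length=" + cs.length);
            }
            // The slice covers indices start (inclusive) to end (exclusive).
            return -bitsPerChar * (end - start);
        }

        @Override
        public double log2Estimate(CharSequence cs) {
            // Delegate to the slice method over the full character array.
            char[] chars = cs.toString().toCharArray();
            return log2Estimate(chars, 0, chars.length);
        }
    }

    public static void main(String[] args) {
        ToyLanguageModel lm = new FixedCostModel(8.0);  // assumed 8 bits per char
        char[] buf = "hello world".toCharArray();

        double slice = lm.log2Estimate(buf, 0, 5);      // the slice "hello"
        double seq = lm.log2Estimate("hello");          // same text as a CharSequence
        System.out.println("slice estimate:    " + slice);
        System.out.println("sequence estimate: " + seq);

        // A log (base 2) estimate converts back to a probability as 2^estimate.
        System.out.println("probability: " + Math.pow(2.0, seq));
    }
}
```

Because `end` is one plus the index of the last character, the slice `(buf, 0, 5)` covers exactly the five characters of `hello`, so both calls in this sketch produce the same estimate.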