java.lang.Object
  com.aliasi.lm.CharSeqMultiCounter
public class CharSeqMultiCounter
A CharSeqMultiCounter combines the counts from a pair
of character sequence counters. The returned values are the values
resulting from combining the counts in both counters.
Multi-counters are particularly useful in situations where a large or constant background counter must be updated several different ways simultaneously. For instance, a general 5-gram counter of a language trained over a lot of data might be combined with an 8-gram topic-specific model for use in a classifier.
More than two counters may be combined by combining them two at
a time. The best strategy is to combine them two at a time into a
balanced tree of counters, as done by the constructor CharSeqMultiCounter(CharSeqCounter[]). For instance, with
CharSeqCounter instances c1,
c2, c3, and c4, the balanced
construction of c1234 in:
    CharSeqCounter c12 = new CharSeqMultiCounter(c1,c2);
    CharSeqCounter c34 = new CharSeqMultiCounter(c3,c4);
    CharSeqCounter c1234 = new CharSeqMultiCounter(c12,c34);
is more efficient for many operations than the linear construction in:
    CharSeqCounter c12 = new CharSeqMultiCounter(c1,c2);
    CharSeqCounter c123 = new CharSeqMultiCounter(c12,c3);
    CharSeqCounter c1234 = new CharSeqMultiCounter(c123,c4);
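The difference between the two constructions can be sketched with a small stand-in model. Everything below is hypothetical and is not the LingPipe implementation: SliceCounter is a one-method stand-in for CharSeqCounter, PairCounter assumes that a pair combines counts by summing them, and BalancedCombiner shows one way an array could be folded into a balanced tree. The depth of nesting is what each call has to traverse: for four counters the balanced tree has depth 2 and the linear chain has depth 3, and the gap grows with the number of counters.

```java
/** Hypothetical stand-in for CharSeqCounter, reduced to a single method. */
interface SliceCounter {
    long count(char[] cs, int start, int end);
}

/** Hypothetical pair counter. Summing the two counts is an assumption. */
final class PairCounter implements SliceCounter {
    final SliceCounter left;
    final SliceCounter right;

    PairCounter(SliceCounter left, SliceCounter right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public long count(char[] cs, int start, int end) {
        return left.count(cs, start, end) + right.count(cs, start, end);
    }
}

final class BalancedCombiner {
    /** Combines counters[from..to) into a balanced binary tree of pairs. */
    static SliceCounter combine(SliceCounter[] counters, int from, int to) {
        if (to - from < 1) {
            throw new IllegalArgumentException("Need at least one counter.");
        }
        if (to - from == 1) {
            return counters[from]; // a single counter is a leaf
        }
        int mid = from + (to - from) / 2;
        return new PairCounter(combine(counters, from, mid),
                               combine(counters, mid, to));
    }

    /** Combines the counters left to right into a linear chain of pairs. */
    static SliceCounter combineLinear(SliceCounter[] counters) {
        SliceCounter acc = counters[0];
        for (int i = 1; i < counters.length; i++) {
            acc = new PairCounter(acc, counters[i]);
        }
        return acc;
    }

    /** Nesting depth of a combined counter; a leaf has depth 0. */
    static int depth(SliceCounter c) {
        if (!(c instanceof PairCounter)) {
            return 0;
        }
        PairCounter p = (PairCounter) c;
        return 1 + Math.max(depth(p.left), depth(p.right));
    }
}
```

Both shapes return the same counts; only the nesting depth differs.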
Implementation Note: The methods numCharactersFollowing(char[],int,int), charactersFollowing(char[],int,int), and observedCharacters() all call the contained counters' CharSeqCounter.charactersFollowing(char[],int,int) methods and
then merge or count the results. All other methods only perform
arithmetic on the results of the corresponding method calls on the
contained counters.
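The merge step described in the note can be illustrated as follows. This is a minimal sketch, not the library's code: CharMerge and mergeSortedUnique are hypothetical names, and the sketch assumes each contained counter returns its following characters sorted in Unicode order with no duplicates, as the method descriptions state.

```java
import java.util.Arrays;

final class CharMerge {
    /**
     * Merges two sorted, duplicate-free character arrays into one sorted,
     * duplicate-free array (the union of the two inputs).
     */
    static char[] mergeSortedUnique(char[] a, char[] b) {
        char[] out = new char[a.length + b.length];
        int i = 0;
        int j = 0;
        int n = 0;
        while (i < a.length || j < b.length) {
            char next;
            if (j >= b.length || (i < a.length && a[i] < b[j])) {
                next = a[i++];      // only a remains, or a's character is smaller
            } else if (i >= a.length || b[j] < a[i]) {
                next = b[j++];      // only b remains, or b's character is smaller
            } else {
                next = a[i];        // same character in both: keep one copy
                i++;
                j++;
            }
            out[n++] = next;
        }
        return Arrays.copyOf(out, n); // trim unused tail
    }
}
```

Under these assumptions, the length of the merged array is the value a combined numCharactersFollowing would report, which is why that method can be derived by counting the merged result.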
| Constructor Summary | |
|---|---|
| CharSeqMultiCounter(CharSeqCounter[] counters) | Construct a character sequence counter from the specified array of counters. |
| CharSeqMultiCounter(CharSeqCounter counter1, CharSeqCounter counter2) | Construct a multi-counter from the specified pair of counters. |
| Method Summary | |
|---|---|
| char[] | charactersFollowing(char[] cs, int start, int end): Returns the array of characters that have been observed following the specified character slice in Unicode order. |
| long | count(char[] cs, int start, int end): Returns the count for the specified character sequence. |
| long | extensionCount(char[] cs, int start, int end): Returns the sum of the counts of all character sequences one character longer than the specified character slice. |
| int | numCharactersFollowing(char[] cs, int start, int end): Returns the number of characters that when appended to the end of the specified character slice produce an extended slice with a non-zero count. |
| char[] | observedCharacters(): Returns an array consisting of the characters with non-zero count in Unicode order. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
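public CharSeqMultiCounter(CharSeqCounter[] counters)
Construct a character sequence counter from the specified array of counters.
Parameters:
counters - Array of counters to back the multi-counter.
Throws:
IllegalArgumentException - If the array of counters has fewer than two elements.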
public CharSeqMultiCounter(CharSeqCounter counter1,
CharSeqCounter counter2)
Construct a multi-counter from the specified pair of counters.
Parameters:
counter1 - First counter in multi-counter.
counter2 - Second counter in multi-counter.

| Method Detail |
|---|
public long count(char[] cs,
int start,
int end)
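Description copied from interface: CharSeqCounter
Returns the count for the specified character sequence.
Specified by:
count in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - Index one past the last character in slice.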
public long extensionCount(char[] cs,
int start,
int end)
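Description copied from interface: CharSeqCounter
Returns the sum of the counts of all character sequences one character longer than the specified character slice.
Specified by:
extensionCount in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - Index one past the last character in slice.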
public int numCharactersFollowing(char[] cs,
int start,
int end)
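Description copied from interface: CharSeqCounter
Returns the number of characters that when appended to the end of the specified character slice produce an extended slice with a non-zero count. In symbols:
numCharactersFollowing(cSlice) = | { c | count(cSlice.c) > 0 } |
where count(cSlice.c) represents the count of the character slice cSlice suffixed with the character c.
Specified by:
numCharactersFollowing in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - One plus index of last character in slice.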
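To make the slice convention and the set definition concrete, here is a toy counter backed by a map of n-gram counts. It is an illustrative sketch only: ToyCounter and its increment method are hypothetical and are not part of com.aliasi.lm. The slice is taken to be the characters of cs from start (inclusive) to end (exclusive), matching the parameter descriptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical map-backed counter, for illustration only. */
final class ToyCounter {
    private final Map<String, Long> counts = new HashMap<>();

    /** Adds one observation of the given n-gram. */
    void increment(String ngram) {
        counts.merge(ngram, 1L, Long::sum);
    }

    /** Count of the slice cs[start..end), or 0 if it was never observed. */
    long count(char[] cs, int start, int end) {
        return counts.getOrDefault(new String(cs, start, end - start), 0L);
    }

    /** Sum of the counts of all sequences one character longer than the slice. */
    long extensionCount(char[] cs, int start, int end) {
        String slice = new String(cs, start, end - start);
        long sum = 0L;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            String key = e.getKey();
            if (key.length() == slice.length() + 1 && key.startsWith(slice)) {
                sum += e.getValue();
            }
        }
        return sum;
    }

    /** Size of the set { c | count(slice.c) > 0 }. */
    int numCharactersFollowing(char[] cs, int start, int end) {
        String slice = new String(cs, start, end - start);
        TreeSet<Character> following = new TreeSet<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            String key = e.getKey();
            if (key.length() == slice.length() + 1
                    && key.startsWith(slice)
                    && e.getValue() > 0) {
                following.add(key.charAt(key.length() - 1));
            }
        }
        return following.size();
    }
}
```

For example, after observing "ab" twice and "ac" once, the slice "a" has two following characters (b and c) and an extension count of three.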
public char[] charactersFollowing(char[] cs,
int start,
int end)
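Description copied from interface: CharSeqCounter
Returns the array of characters that have been observed following the specified character slice in Unicode order.
Specified by:
charactersFollowing in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - One plus index of last character in slice.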
public char[] observedCharacters()
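Description copied from interface: CharSeqCounter
Returns an array consisting of the characters with non-zero count in Unicode order. The result is the same as that of charactersFollowing(new char[0],0,0).
Specified by:
observedCharacters in interface CharSeqCounter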