java.lang.Object
  com.aliasi.tokenizer.RegExTokenizerFactory
    com.aliasi.tokenizer.LineTokenizerFactory
public class LineTokenizerFactory extends RegExTokenizerFactory
A LineTokenizerFactory treats each line of an input as
a token. The whitespace between lines consists only of line terminators. This
is useful for decoders that work at the line level.
Line terminators are as defined in Pattern,
and include all of the Windows, Unix, and Macintosh standards, as well
as some Unicode extensions.
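The line terminators recognized by java.util.regex.Pattern under its default flags can be checked with the standard library alone. The following is a minimal sketch, not LingPipe code: the class name LineTerminatorSketch and its helper countLines are hypothetical, and the terminator list is taken from the Pattern documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it only uses java.util.regex, not LingPipe.
final class LineTerminatorSketch {

    // Line terminators listed in the java.util.regex.Pattern documentation:
    // newline, carriage return + newline, carriage return,
    // next-line (U+0085), line separator (U+2028), paragraph separator (U+2029).
    static final String[] TERMINATORS = {
        "\n", "\r\n", "\r", "\u0085", "\u2028", "\u2029"
    };

    // Counts the matches of ".+" (default flags), i.e. the non-empty lines.
    static int countLines(String input) {
        Matcher matcher = Pattern.compile(".+").matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (String terminator : TERMINATORS) {
            // Each terminator splits "a" and "b" into two separate lines.
            System.out.println(countLines("a" + terminator + "b"));
        }
    }
}
```

A tab or a space is not a line terminator, so a call such as countLines("a\tb") counts a single line.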
Whitespaces will be either empty strings or strings representing one or more newlines.
Tokens may consist entirely of whitespace characters if whitespace is the only thing on a line, but tokens never contain line terminator sequences. Tokens always consist of at least one character.
| Input String | Tokens | Whitespaces |
|---|---|---|
| "" | {} | { "" } |
| "abc" | { "abc" } | { "", "" } |
| "abc\ndef" | { "abc", "def" } | { "", "\n", "" } |
| "abc\r\ndef" | { "abc", "def" } | { "", "\r\n", "" } |
| " abc\n def \n" | { " abc", " def " } | { "", "\n", "\n" } |
| " \n" | { " " } | { "", "\n" } |
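The input examples above can be reproduced with java.util.regex alone, which may help when reasoning about edge cases. This is a rough sketch under stated assumptions: LineSplitSketch and its methods tokens and whitespaces are hypothetical names, and the code imitates the documented behavior rather than calling LingPipe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented behavior; not part of LingPipe.
final class LineSplitSketch {

    // The expression named in the class description, compiled with default flags.
    private static final Pattern LINE = Pattern.compile(".+");

    // Returns every match of ".+" in order: one token per non-empty line.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = LINE.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    // Returns the text before the first token, between tokens, and after the last token.
    static List<String> whitespaces(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = LINE.matcher(input);
        int previousEnd = 0;
        while (matcher.find()) {
            result.add(input.substring(previousEnd, matcher.start()));
            previousEnd = matcher.end();
        }
        result.add(input.substring(previousEnd));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("abc\r\ndef"));             // prints [abc, def]
        System.out.println(whitespaces("abc\r\ndef").size()); // prints 3
    }
}
```

In this sketch there is always one more whitespace than there are tokens, which matches every example listed above.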
A line tokenizer factory may be serialized. Upon
deserialization, the resulting object will be the singleton
INSTANCE.
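The page does not say how this singleton-preserving deserialization is implemented. One common Java idiom that produces the same observable result is a readResolve method. The sketch below illustrates that idiom on a hypothetical class SingletonFactorySketch and should not be read as LingPipe's actual mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class illustrating the readResolve idiom; not LingPipe code.
final class SingletonFactorySketch implements Serializable {

    private static final long serialVersionUID = 1L;

    static final SingletonFactorySketch INSTANCE = new SingletonFactorySketch();

    private SingletonFactorySketch() { }

    // Called by Java serialization after reading; swaps the fresh copy for the singleton.
    private Object readResolve() {
        return INSTANCE;
    }

    // Serializes an object to bytes in memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Reference equality holds because readResolve returns the singleton.
        System.out.println(roundTrip(INSTANCE) == INSTANCE); // prints true
    }
}
```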
This tokenizer factory is nothing more than a convenience
wrapper around a very simple RegExTokenizerFactory, with
the simplest possible regular expression:
RegExTokenizerFactory(".+")
Because the regular expression tokenizer factory takes the
default regular expression flags (see Pattern),
the period (.) matches any character except a line terminator.
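The effect of the default flags can be seen by comparing them with Pattern.DOTALL, under which the period also matches line terminators. This is a small sketch using only java.util.regex; the class DotFlagSketch and its method matchCount are hypothetical names introduced for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it uses only the standard java.util.regex API.
final class DotFlagSketch {

    // Counts the matches of ".+" in the input under the given Pattern flags.
    static int matchCount(String input, int flags) {
        Matcher matcher = Pattern.compile(".+", flags).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "abc\ndef";
        // Default flags: '.' stops at the newline, so each line is its own match.
        System.out.println(matchCount(text, 0));              // prints 2
        // DOTALL: '.' also matches the newline, so the whole input is one match.
        System.out.println(matchCount(text, Pattern.DOTALL)); // prints 1
    }
}
```

With DOTALL the whole input would become a single token, so the line-per-token behavior described here depends on the default flags.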
| Field Summary | |
|---|---|
| static LineTokenizerFactory | INSTANCE: A reusable instance of this class. |
| Method Summary | |
|---|---|
| String | toString(): Returns a string representation of this factory, consisting of its name. |
| Methods inherited from class com.aliasi.tokenizer.RegExTokenizerFactory |
|---|
| pattern, tokenizer |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final LineTokenizerFactory INSTANCE
| Method Detail |
|---|
public String toString()
Overrides: toString in class RegExTokenizerFactory