com.aliasi.tokenizer
Class WhitespaceNormTokenizerFactory
java.lang.Object
com.aliasi.tokenizer.ModifiedTokenizerFactory
com.aliasi.tokenizer.ModifyTokenTokenizerFactory
com.aliasi.tokenizer.WhitespaceNormTokenizerFactory
- All Implemented Interfaces:
- TokenizerFactory, Serializable
public class WhitespaceNormTokenizerFactory
- extends ModifyTokenTokenizerFactory
- implements Serializable
A WhitespaceNormTokenizerFactory filters the tokenizers produced
by a base tokenizer factory, converting each non-empty whitespace to a
single space and leaving empty (zero-length) whitespaces unchanged.
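The effect on a token stream can be illustrated without LingPipe on the classpath. The sketch below is a stand-in under stated assumptions, not the library's implementation: the class WhitespaceNormDemo and its method normalize are hypothetical, and a regular expression (runs of word characters, or single punctuation characters) takes the place of the base tokenizer. It walks the tokens, maps every non-empty whitespace between them to one space, and keeps zero-length whitespaces empty.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; it does not use or reproduce any LingPipe class.
class WhitespaceNormDemo {

    // Assumed toy tokenization: a run of word characters, or one
    // character that is neither a word character nor whitespace.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    // Rebuilds the text from its tokens, replacing each non-empty
    // whitespace with a single space and keeping empty whitespaces empty.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (m.find()) {
            // Whitespace between the previous token and this one.
            String gap = text.substring(pos, m.start());
            out.append(gap.isEmpty() ? "" : " ");
            out.append(m.group());
            pos = m.end();
        }
        // Whitespace after the last token follows the same rule.
        String tail = text.substring(pos);
        out.append(tail.isEmpty() ? "" : " ");
        return out.toString();
    }
}
```

Under these assumptions, WhitespaceNormDemo.normalize("Hello,  world\t\n!") returns "Hello, world !": the zero-length whitespace between "Hello" and "," stays empty, while the two spaces and the tab-newline pair each become one space.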
Thread Safety
A whitespace normalizing tokenizer factory is thread
safe if its base tokenizer factory is thread safe.
Serialization
A whitespace normalizing tokenizer factory is serializable if its
base tokenizer factory is serializable.
- Since:
- LingPipe 3.8
- Version:
- 3.8
- Author:
- Bob Carpenter
- See Also:
- Serialized Form
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
WhitespaceNormTokenizerFactory
public WhitespaceNormTokenizerFactory(TokenizerFactory factory)
- Construct a whitespace normalizing tokenizer factory from the
specified base factory.
- Parameters:
factory - Base tokenizer factory.
modifyWhitespace
public String modifyWhitespace(String whitespace)
- Return the normalized form of the specified whitespace.
- Overrides:
modifyWhitespace in class ModifyTokenTokenizerFactory
- Parameters:
whitespace - Input whitespace.
- Returns:
- Normalized whitespace.
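The override relationship noted above can be sketched with two small stand-in classes. Both class names (PassThroughModifier, SingleSpaceModifier) are hypothetical, and the pass-through default in the parent is an assumption made for this sketch rather than a statement about ModifyTokenTokenizerFactory. Only the method name and the normalization rule come from this page.

```java
// Hypothetical parent: mirrors only the method name documented above.
class PassThroughModifier {
    // Assumed default for this sketch: return the whitespace unchanged.
    public String modifyWhitespace(String whitespace) {
        return whitespace;
    }
}

// Hypothetical subclass applying the normalization rule from this page.
class SingleSpaceModifier extends PassThroughModifier {
    @Override
    public String modifyWhitespace(String whitespace) {
        // Non-empty whitespace collapses to one space; empty stays empty.
        return whitespace.isEmpty() ? "" : " ";
    }
}
```

Because the result is either the empty string or a single space, applying the method to its own output changes nothing, so the normalization in this sketch is idempotent.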