com.aliasi.tokenizer
Interface TokenizerFactory

All Known Implementing Classes:
CharacterTokenizerFactory, EnglishStopTokenizerFactory, IndoEuropeanTokenizerFactory, LineTokenizerFactory, LowerCaseTokenizerFactory, ModifiedTokenizerFactory, ModifyTokenTokenizerFactory, NGramTokenizerFactory, PorterStemmerTokenizerFactory, RegExFilteredTokenizerFactory, RegExTokenizerFactory, SoundexTokenizerFactory, StopTokenizerFactory, TokenLengthTokenizerFactory, WhitespaceNormTokenizerFactory

public interface TokenizerFactory

A TokenizerFactory constructs tokenizers from subsequences of character arrays.

Tokenizer factories are typically implemented to be serializable so that they may be serialized along with the models that depend on them.
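The serialization point can be illustrated with a minimal sketch. It does not use LingPipe classes: SketchFactory and ConfigurableFactory are hypothetical stand-ins invented for this example, and the stand-in returns a plain Iterator rather than a LingPipe Tokenizer. The sketch only shows the general pattern of a factory that implements java.io.Serializable, so that its configuration survives a write and read with standard Java object streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

class SerializableFactorySketch {

    /** Hypothetical stand-in for a tokenizer factory; not the LingPipe interface. */
    interface SketchFactory {
        Iterator<String> tokenizer(char[] ch, int start, int length);
    }

    /** A factory whose only state is a flag, so the whole object serializes trivially. */
    static final class ConfigurableFactory implements SketchFactory, Serializable {
        private static final long serialVersionUID = 1L;
        private final boolean lowerCase;

        ConfigurableFactory(boolean lowerCase) {
            this.lowerCase = lowerCase;
        }

        @Override
        public Iterator<String> tokenizer(char[] ch, int start, int length) {
            List<String> tokens = new ArrayList<>();
            // Split the requested slice on whitespace, skipping empty pieces.
            for (String piece : new String(ch, start, length).split("\\s+")) {
                if (!piece.isEmpty()) {
                    tokens.add(lowerCase ? piece.toLowerCase(Locale.ROOT) : piece);
                }
            }
            return tokens.iterator();
        }
    }

    /** Writes the object to an in-memory buffer and reads back a fresh copy. */
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        ConfigurableFactory copy = roundTrip(new ConfigurableFactory(true));
        Iterator<String> it = copy.tokenizer("Hello World".toCharArray(), 0, 11);
        while (it.hasNext()) {
            System.out.println(it.next()); // the lower-casing flag survived the round trip
        }
    }
}
```

A model that holds a reference to such a factory can then be written and read as a single object graph, which is the motivation given above. The interface documented here does not itself extend Serializable, so whether a given factory can be serialized depends on the implementing class.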

Since:
LingPipe1.0
Version:
1.0
Author:
Bob Carpenter

Method Summary
 Tokenizer tokenizer(char[] ch, int start, int length)
          Returns a tokenizer for the specified subsequence of characters.
 

Method Detail

tokenizer

Tokenizer tokenizer(char[] ch,
                    int start,
                    int length)
Returns a tokenizer for the specified subsequence of characters.

Parameters:
ch - Characters to tokenize.
start - Index of first character to tokenize.
length - Number of characters to tokenize.
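
The following sketch illustrates how the three parameters select a slice of the array: tokenization covers indices start (inclusive) through start + length (exclusive). To stay self-contained it does not depend on LingPipe. SketchTokenizer, SketchTokenizerFactory, and WhitespaceSketchFactory are hypothetical stand-ins that mirror only the method signature documented above, and the convention that nextToken() returns null when no tokens remain is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class TokenizerFactorySketch {

    /** Hypothetical stand-in for a tokenizer; only nextToken() is modeled. */
    interface SketchTokenizer {
        /** Returns the next token, or null when none remain (assumed convention). */
        String nextToken();
    }

    /** Hypothetical stand-in mirroring the tokenizer(char[], int, int) signature. */
    interface SketchTokenizerFactory {
        SketchTokenizer tokenizer(char[] ch, int start, int length);
    }

    /** Splits the selected slice on whitespace; an illustration, not a LingPipe class. */
    static final class WhitespaceSketchFactory implements SketchTokenizerFactory {
        @Override
        public SketchTokenizer tokenizer(final char[] ch, final int start, final int length) {
            final int end = start + length; // exclusive upper bound of the slice
            return new SketchTokenizer() {
                private int pos = start;

                @Override
                public String nextToken() {
                    // Skip whitespace before the next token.
                    while (pos < end && Character.isWhitespace(ch[pos])) {
                        pos++;
                    }
                    if (pos >= end) {
                        return null; // slice exhausted
                    }
                    int tokenStart = pos;
                    // Consume characters up to the next whitespace or the slice end.
                    while (pos < end && !Character.isWhitespace(ch[pos])) {
                        pos++;
                    }
                    return new String(ch, tokenStart, pos - tokenStart);
                }
            };
        }
    }

    /** Drains a tokenizer created for the given slice into a list. */
    static List<String> tokens(SketchTokenizerFactory factory, char[] ch, int start, int length) {
        List<String> result = new ArrayList<>();
        SketchTokenizer tokenizer = factory.tokenizer(ch, start, length);
        for (String t = tokenizer.nextToken(); t != null; t = tokenizer.nextToken()) {
            result.add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        char[] cs = "ab cd ef gh".toCharArray();
        // start = 3, length = 5 selects the slice "cd ef".
        System.out.println(tokens(new WhitespaceSketchFactory(), cs, 3, 5)); // prints [cd, ef]
    }
}
```

Passing an offset and a length rather than a String lets a caller tokenize part of a larger buffer without copying it first.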