com.aliasi.tokenizer
Class ModifiedTokenizerFactory

java.lang.Object
  extended by com.aliasi.tokenizer.ModifiedTokenizerFactory
All Implemented Interfaces:
TokenizerFactory
Direct Known Subclasses:
ModifyTokenTokenizerFactory

public abstract class ModifiedTokenizerFactory
extends Object
implements TokenizerFactory

A ModifiedTokenizerFactory is an abstract tokenizer factory that modifies a tokenizer returned by a base tokenizer factory.

The abstract method modify(Tokenizer) implements the modification.
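The relationship between tokenizer(char[],int,int) and modify(Tokenizer) can be sketched as follows. This is a minimal, self-contained illustration that assumes simplified stand-in types so that it runs without the LingPipe jar: the nested Tokenizer, TokenizerFactory, and ModifiedTokenizerFactory types below only mimic the shape documented on this page (the real com.aliasi.tokenizer.Tokenizer has more methods), and WhitespaceFactory, LowerCaseFactory, tokenize, and demo are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ModifiedFactorySketch {

    // Simplified stand-in for com.aliasi.tokenizer.Tokenizer (assumption:
    // only nextToken() is modeled; it returns null when no tokens remain).
    static abstract class Tokenizer {
        abstract String nextToken();
    }

    // Stand-in for the TokenizerFactory interface documented above.
    interface TokenizerFactory {
        Tokenizer tokenizer(char[] cs, int start, int length);
    }

    // Stand-in mirroring the documented shape of ModifiedTokenizerFactory.
    static abstract class ModifiedTokenizerFactory implements TokenizerFactory {
        private final TokenizerFactory baseFactory;

        ModifiedTokenizerFactory(TokenizerFactory baseFactory) {
            this.baseFactory = baseFactory;
        }

        public TokenizerFactory baseTokenizerFactory() {
            return baseFactory;
        }

        protected abstract Tokenizer modify(Tokenizer tokenizer);

        // Ask the base factory for a tokenizer, then pass it through modify.
        public Tokenizer tokenizer(char[] cs, int start, int length) {
            return modify(baseFactory.tokenizer(cs, start, length));
        }
    }

    // Hypothetical base factory: splits the slice on runs of whitespace.
    static class WhitespaceFactory implements TokenizerFactory {
        public Tokenizer tokenizer(char[] cs, int start, int length) {
            String text = new String(cs, start, length).trim();
            final String[] parts = text.isEmpty() ? new String[0] : text.split("\\s+");
            return new Tokenizer() {
                private int next = 0;
                String nextToken() {
                    return next < parts.length ? parts[next++] : null;
                }
            };
        }
    }

    // Hypothetical concrete subclass: lowercases every token of the base tokenizer.
    static class LowerCaseFactory extends ModifiedTokenizerFactory {
        LowerCaseFactory(TokenizerFactory baseFactory) {
            super(baseFactory);
        }

        protected Tokenizer modify(final Tokenizer tokenizer) {
            return new Tokenizer() {
                String nextToken() {
                    String token = tokenizer.nextToken();
                    return token == null ? null : token.toLowerCase(Locale.ROOT);
                }
            };
        }
    }

    // Collects all tokens that the factory produces for the whole text.
    public static List<String> tokenize(TokenizerFactory factory, String text) {
        char[] cs = text.toCharArray();
        Tokenizer tokenizer = factory.tokenizer(cs, 0, cs.length);
        List<String> tokens = new ArrayList<>();
        for (String token; (token = tokenizer.nextToken()) != null; ) {
            tokens.add(token);
        }
        return tokens;
    }

    public static List<String> demo() {
        TokenizerFactory factory = new LowerCaseFactory(new WhitespaceFactory());
        return tokenize(factory, "Hello  LingPipe World");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [hello, lingpipe, world]
    }
}
```

In this sketch a subclass overrides only modify; the base factory stays reachable through baseTokenizerFactory(), so modified factories can be stacked on top of one another.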

Since:
LingPipe 3.8
Version:
3.8
Author:
Bob Carpenter

Constructor Summary
ModifiedTokenizerFactory(TokenizerFactory baseFactory)
          Construct a modified tokenizer factory with the specified base factory.
 
Method Summary
 TokenizerFactory baseTokenizerFactory()
          Return the base tokenizer factory.
protected abstract  Tokenizer modify(Tokenizer tokenizer)
          Return a modified form of the specified tokenizer.
 Tokenizer tokenizer(char[] cs, int start, int length)
          Return the tokenizer for the specified character array slice. The tokenizer is generated by the base tokenizer factory and then modified by the modify(Tokenizer) method.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ModifiedTokenizerFactory

public ModifiedTokenizerFactory(TokenizerFactory baseFactory)
Construct a modified tokenizer factory with the specified base factory.

Parameters:
baseFactory - Underlying tokenizer factory whose tokenizers are modified.

Method Detail

baseTokenizerFactory

public TokenizerFactory baseTokenizerFactory()
Return the base tokenizer factory.

Returns:
The base tokenizer factory.

tokenizer

public Tokenizer tokenizer(char[] cs,
                           int start,
                           int length)
Return the tokenizer for the specified character array slice. The tokenizer is generated by the base tokenizer factory and then modified by the modify(Tokenizer) method.

Specified by:
tokenizer in interface TokenizerFactory
Parameters:
cs - Characters to tokenize.
start - Index of first character to tokenize.
length - Number of characters to tokenize.
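
As the parameter descriptions indicate, the slice is given by a start index and a length, not by a start index and an end index. A minimal sketch of which characters such a slice covers, using only the standard String(char[],int,int) constructor (SliceSketch and slice are hypothetical names for this example):

```java
public class SliceSketch {

    // Returns the text covered by the slice (cs, start, length),
    // that is, the characters cs[start] through cs[start + length - 1].
    public static String slice(char[] cs, int start, int length) {
        return new String(cs, start, length);
    }

    public static void main(String[] args) {
        char[] cs = "The quick brown fox".toCharArray();
        // Start at index 4 and take 5 characters.
        System.out.println(slice(cs, 4, 5)); // prints quick
    }
}
```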

modify

protected abstract Tokenizer modify(Tokenizer tokenizer)
Return a modified form of the specified tokenizer. This method is called by tokenizer(char[],int,int) to modify the tokenizer produced by the base tokenizer factory.

Parameters:
tokenizer - Tokenizer to modify.
Returns:
The modified tokenizer.