java.lang.Object
com.aliasi.tokenizer.ModifiedTokenizerFactory
com.aliasi.tokenizer.ModifyTokenTokenizerFactory
com.aliasi.tokenizer.PorterStemmerTokenizerFactory
public class PorterStemmerTokenizerFactory extends ModifyTokenTokenizerFactory
A PorterStemmerTokenizerFactory applies Porter's stemmer
to each token produced by the tokenizers of a base tokenizer factory.
Porter's stemmer approximates the conversion of a word
to its morphological base form. This class also provides a single
top-level static method, stem(String), which returns the
stemmed form of an input string.
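The wrapping behavior described above can be sketched without the library. The following is a minimal, self-contained illustration, not LingPipe code: `SimpleTokenizerFactory`, `WhitespaceFactory`, `ModifyingFactory`, and `toyStem` are hypothetical stand-ins invented for this example, and `toyStem` is a toy suffix stripper, not Porter's algorithm. It only shows the general shape of a factory that delegates tokenization to a base factory and then rewrites each token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins for illustration only; these are NOT LingPipe types.
class StemmingFactorySketch {

    // Stand-in for a tokenizer factory: turns text into a list of tokens.
    interface SimpleTokenizerFactory {
        List<String> tokenize(String text);
    }

    // Base factory: splits the input on runs of whitespace.
    static class WhitespaceFactory implements SimpleTokenizerFactory {
        @Override
        public List<String> tokenize(String text) {
            String trimmed = text.trim();
            if (trimmed.isEmpty()) {
                return new ArrayList<>();
            }
            return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
        }
    }

    // Wrapper: delegates tokenization to the base factory, then rewrites each token.
    static class ModifyingFactory implements SimpleTokenizerFactory {
        private final SimpleTokenizerFactory base;
        private final UnaryOperator<String> modifier;

        ModifyingFactory(SimpleTokenizerFactory base, UnaryOperator<String> modifier) {
            this.base = base;
            this.modifier = modifier;
        }

        @Override
        public List<String> tokenize(String text) {
            List<String> out = new ArrayList<>();
            for (String token : base.tokenize(text)) {
                out.add(modifier.apply(token));
            }
            return out;
        }
    }

    // Toy suffix stripper, NOT Porter's algorithm: removes one trailing "ing" or "s".
    static String toyStem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s") && !token.endsWith("ss")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    public static void main(String[] args) {
        SimpleTokenizerFactory stemming =
            new ModifyingFactory(new WhitespaceFactory(), StemmingFactorySketch::toyStem);
        System.out.println(stemming.tokenize("walking dogs bark")); // prints [walk, dog, bark]
    }
}
```

In this sketch the modifier is passed in as a function, whereas the class documented here fixes the modification to Porter stemming through its modifyToken(String) method.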
The underlying stemming code is Martin Porter's own public domain Java port of his original C implementation of the stemmer. More information can be found at:
Porter Stemmer Home Page
The original paper describing Porter's stemmer is:
Porter, Martin. 1980. An algorithm for suffix stripping. Program 14(3):130-137.
| Constructor Summary | |
|---|---|
| PorterStemmerTokenizerFactory(TokenizerFactory factory) | Construct a tokenizer factory that applies Porter stemming to the tokenizers produced by the specified base factory. |

| Method Summary | |
|---|---|
| String | modifyToken(String token): Returns the Porter stemmed version of the specified token. |
| static String | stem(String in): Return the stem of the specified input string using the Porter stemmer. |

| Methods inherited from class com.aliasi.tokenizer.ModifyTokenTokenizerFactory |
|---|
| modify, modifyWhitespace |

| Methods inherited from class com.aliasi.tokenizer.ModifiedTokenizerFactory |
|---|
| baseTokenizerFactory, tokenizer |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Constructor Detail |
|---|

public PorterStemmerTokenizerFactory(TokenizerFactory factory)

factory - Base tokenizer factory.

| Method Detail |
|---|

public String modifyToken(String token)

modifyToken in class ModifyTokenTokenizerFactory

token - Token to stem.

public static String stem(String in)

in - String to stem.