com.aliasi.features
Class ChunkerFeatureExtractor
java.lang.Object
com.aliasi.features.ChunkerFeatureExtractor
- All Implemented Interfaces:
- FeatureExtractor<CharSequence>, Serializable
public class ChunkerFeatureExtractor
- extends Object
- implements FeatureExtractor<CharSequence>, Serializable
A ChunkerFeatureExtractor implements a feature extractor
for character sequences based on a specified chunker. Feature
names are derived from the chunk types optionally concatenated to
the phrase making up the chunk. Feature values are the count of
their occurrences.
For instance, if a chunker were to return a chunk of type PER spanning the phrase John and a chunk of type LOC spanning the phrase New York, then the features will
be PER:1, LOC:1 if the phrases are not included and
PER_John:1, LOC_New York:1 if they are included. If the phrase John
had shown up three times, the value for PER_John would
be 3 (assuming phrases are included).
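The naming and counting rule described above can be sketched in plain Java. This is an illustration only, not the LingPipe implementation: the Span class and the features helper below are hypothetical stand-ins for a chunker's output and for the extractor, and the character offsets are hard-coded rather than produced by a real chunker.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChunkFeatureSketch {

    // Hypothetical stand-in for a chunk: a typed span [start, end) over the input.
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Count one occurrence per chunk. The feature name is the chunk type alone,
    // or the type joined to the chunk's phrase with '_' when includePhrase is true.
    static Map<String, Integer> features(CharSequence in,
                                         List<Span> spans,
                                         boolean includePhrase) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Span s : spans) {
            String name = includePhrase
                ? s.type + "_" + in.subSequence(s.start, s.end)
                : s.type;
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "John met John in New York. John left.";
        // Offsets assumed to come from some chunker; hard-coded here.
        List<Span> spans = Arrays.asList(
            new Span(0, 4, "PER"),
            new Span(9, 13, "PER"),
            new Span(17, 25, "LOC"),
            new Span(27, 31, "PER"));
        System.out.println(features(text, spans, false));
        System.out.println(features(text, spans, true));
    }
}
```

With phrases excluded, every PER chunk contributes to the single feature PER, so the count reflects the number of chunks of that type rather than the number of distinct phrases.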
Serialization
A chunker-based feature extractor will be serializable if its
underlying chunker is serializable.
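As a rough illustration of why serializability depends on the wrapped object, the sketch below uses default Java serialization with hypothetical stand-in types (Tagger, Extractor). How LingPipe implements serialization internally is not stated here, so treat this only as a demonstration of the general Java behavior: writing a Serializable wrapper fails at runtime if a non-transient field refers to a non-serializable object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableDelegateSketch {

    // Hypothetical stand-in for a chunker-like dependency.
    interface Tagger {
        String tag(CharSequence in);
    }

    static final class SerializableTagger implements Tagger, Serializable {
        private static final long serialVersionUID = 1L;
        public String tag(CharSequence in) { return "PER"; }
    }

    static final class PlainTagger implements Tagger {
        public String tag(CharSequence in) { return "PER"; }
    }

    // Wrapper that is declared Serializable but stores its delegate
    // in an ordinary (non-transient) field.
    static final class Extractor implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Tagger tagger;
        Extractor(Tagger tagger) { this.tagger = tagger; }
    }

    // Returns true if the object graph can be written with default serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Extractor(new SerializableTagger())));
        System.out.println(canSerialize(new Extractor(new PlainTagger())));
    }
}
```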
Thread Safety
Upon safe publishing, a chunker feature extractor will be
thread safe if its underlying chunker is thread safe.
- Since:
- LingPipe 3.9.2
- Version:
- 3.9.2
- Author:
- Bob Carpenter
- See Also:
- Serialized Form
Constructor Summary
ChunkerFeatureExtractor(Chunker chunker,
boolean includePhrase)
Construct a new chunker feature extractor based on the
specified chunker, including the phrases extracted if the
specified flag is true.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ChunkerFeatureExtractor
public ChunkerFeatureExtractor(Chunker chunker,
boolean includePhrase)
- Construct a new chunker feature extractor based on the
specified chunker, including the phrases extracted if the
specified flag is true.
- Parameters:
chunker - Base chunker for the extractor.
includePhrase - Set to true to append the
phrase derived from the chunk to the feature name.
features
public Map<String,? extends Number> features(CharSequence in)
- Description copied from interface:
FeatureExtractor
- Return the feature vector for the specified input.
- Specified by:
features in interface FeatureExtractor<CharSequence>
- Parameters:
in - Input object.
- Returns:
- The feature vector for the specified input.
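Because the return type is Map<String,? extends Number>, callers can read values through the Number interface but cannot insert new entries into the returned map. The following is a minimal sketch of consuming such a map; the total helper and the sample map are hypothetical, standing in for a vector returned by features.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FeatureVectorSketch {

    // Sum all feature values, whatever Number subtype the map holds.
    static double total(Map<String, ? extends Number> features) {
        double sum = 0.0;
        for (Map.Entry<String, ? extends Number> e : features.entrySet()) {
            sum += e.getValue().doubleValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for an extractor's output.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("PER_John", 3);
        counts.put("LOC_New York", 1);
        System.out.println(total(counts));
    }
}
```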