com.aliasi.features
Class ChunkerFeatureExtractor

java.lang.Object
  extended by com.aliasi.features.ChunkerFeatureExtractor
All Implemented Interfaces:
FeatureExtractor<CharSequence>, Serializable

public class ChunkerFeatureExtractor
extends Object
implements FeatureExtractor<CharSequence>, Serializable

A ChunkerFeatureExtractor implements a feature extractor for character sequences based on a specified chunker. Each feature name is a chunk type, optionally concatenated with the phrase the chunk spans. Each feature value is the number of times that feature occurs in the input.

For instance, if a chunker were to return a chunk of type PER spanning the phrase John and a chunk of type LOC spanning the phrase New York, then the features will be PER:1, LOC:1 if the phrases are not included, and PER_John:1, LOC_New York:1 if they are included. If the phrase John had shown up three times as a PER chunk, the value for PER_John would be 3 (assuming phrases are included).
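
The naming and counting scheme described above can be sketched without any LingPipe dependency. This is only an illustration: SimpleChunk and ChunkFeatureSketch are hypothetical stand-ins, not LingPipe classes, and the underscore separator is inferred from the example above rather than taken from the library source.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a chunk: a typed span [start, end) over the input.
final class SimpleChunk {
    final int start;
    final int end;
    final String type;

    SimpleChunk(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

// Hypothetical sketch of the feature naming scheme; not the LingPipe implementation.
class ChunkFeatureSketch {
    static Map<String, Integer> features(CharSequence in,
                                         List<SimpleChunk> chunks,
                                         boolean includePhrase) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (SimpleChunk c : chunks) {
            // Feature name is the chunk type, optionally followed by "_" and the phrase.
            String name = includePhrase
                ? c.type + "_" + in.subSequence(c.start, c.end)
                : c.type;
            // Feature value is the number of occurrences of that name.
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the input "John moved to New York. John stayed." with PER chunks at [0,4) and [24,28) and a LOC chunk at [14,22), this sketch yields PER:2, LOC:1 without phrases and PER_John:2, LOC_New York:1 with phrases.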

Serialization

A chunker-based feature extractor will be serializable if its underlying chunker is serializable.
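
The dependence on the underlying chunker follows the general Java serialization rule that an object graph serializes only if every non-transient object in it is serializable. The sketch below shows that rule with hypothetical toy types (ToyChunker, ToyExtractor and the rest are not LingPipe classes). LingPipe's actual serialization mechanism may differ from plain field serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a chunker; the interface itself is not Serializable.
interface ToyChunker {
    String typeOf(String phrase);
}

// An implementation that opts in to serialization.
class SerializableToyChunker implements ToyChunker, Serializable {
    private static final long serialVersionUID = 1L;
    public String typeOf(String phrase) { return "PER"; }
}

// An implementation that does not.
class PlainToyChunker implements ToyChunker {
    public String typeOf(String phrase) { return "PER"; }
}

// Hypothetical extractor: declared Serializable, but it holds a chunker field.
class ToyExtractor implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ToyChunker chunker;
    ToyExtractor(ToyChunker chunker) { this.chunker = chunker; }
}

class SerializationSketch {
    // Returns true if standard Java serialization of the object succeeds.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // Some object in the graph does not implement Serializable.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, a ToyExtractor wrapping a SerializableToyChunker can be written out, while one wrapping a PlainToyChunker fails with a NotSerializableException.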

Thread Safety

Once safely published, a chunker feature extractor is thread safe if its underlying chunker is thread safe.
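
One common way to meet the safe-publication condition is to store the shared object in a final field before any worker thread sees it. The sketch below is an assumption-laden illustration: SharedExtractorSketch is hypothetical, and a plain Function stands in for a thread-safe extractor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical holder that shares one extractor-like function across threads.
class SharedExtractorSketch {
    // A final field assigned in the constructor is safely published
    // to any thread that later obtains a reference to this object.
    private final Function<String, Integer> extractor;

    SharedExtractorSketch(Function<String, Integer> extractor) {
        this.extractor = extractor;
    }

    // Applies the shared extractor to each input on a thread pool,
    // returning results in input order.
    List<Integer> extractAll(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String s : inputs) {
                tasks.add(() -> extractor.apply(s));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Concurrent use is only correct here if the wrapped function is itself thread safe, which mirrors the condition stated above for the underlying chunker.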

Since:
LingPipe 3.9.2
Version:
3.9.2
Author:
Bob Carpenter
See Also:
Serialized Form

Constructor Summary
ChunkerFeatureExtractor(Chunker chunker, boolean includePhrase)
          Construct a new chunker feature extractor based on the specified chunker, including the phrases extracted if the specified flag is true.
 
Method Summary
 Map<String,? extends Number> features(CharSequence in)
          Return the feature vector for the specified input.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChunkerFeatureExtractor

public ChunkerFeatureExtractor(Chunker chunker,
                               boolean includePhrase)
Construct a new chunker feature extractor based on the specified chunker, including the phrases extracted if the specified flag is true.

Parameters:
chunker - Base chunker for the extractor.
includePhrase - Set to true to append the phrase derived from the chunk to the feature name.
Method Detail

features

public Map<String,? extends Number> features(CharSequence in)
Description copied from interface: FeatureExtractor
Return the feature vector for the specified input.

Specified by:
features in interface FeatureExtractor<CharSequence>
Parameters:
in - Input object.
Returns:
The feature vector for the specified input.
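
Because the return type is Map<String,? extends Number>, callers can read values through Number but cannot put new entries into the map. A minimal sketch of consuming such a feature vector follows. FeatureVectorSketch and its methods are hypothetical helpers, not part of LingPipe, and treating an absent feature as zero is an assumption of the sketch.

```java
import java.util.Map;

// Hypothetical helpers for reading a wildcard-typed feature vector.
class FeatureVectorSketch {
    // Sums all feature values via Number.doubleValue().
    static double total(Map<String, ? extends Number> features) {
        double sum = 0.0;
        for (Map.Entry<String, ? extends Number> e : features.entrySet()) {
            sum += e.getValue().doubleValue();
        }
        return sum;
    }

    // Looks up one feature, treating an absent feature as zero (an assumption here).
    static double valueOf(Map<String, ? extends Number> features, String name) {
        Number v = features.get(name);
        return v == null ? 0.0 : v.doubleValue();
    }
}
```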