java.lang.Object
  com.aliasi.features.FeatureExtractorFilter<E>
    com.aliasi.features.ZScoreFeatureExtractor<E>

Type Parameters:
E - The type of object whose features are extracted.

public class ZScoreFeatureExtractor<E>
extends FeatureExtractorFilter<E>

A ZScoreFeatureExtractor converts features to their
z-scores, where means and deviations are determined by a
corpus supplied at construction time.
Means and standard deviations are computed for each feature in the training section of the corpus supplied to the constructor.
At run time, feature values are converted to z-scores by:

z(feat,val) = (val - mean(feat))/stdDev(feat)

where feat is the feature, val is the value
to be converted to a z-score, mean(feat) is the mean
(average) of the feature in the training corpus, and
stdDev(feat) is the standard deviation of the feature
in the training corpus.
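As a worked illustration of the formula above, the following minimal sketch applies it to a few values. It is plain Java and does not use this class; the class name ZScoreFormulaSketch and the statistics (mean 10.0, standard deviation 2.0) are hypothetical values chosen for the example.

```java
// Minimal sketch of the z-score formula; not LingPipe's implementation.
final class ZScoreFormulaSketch {

    // z(feat,val) = (val - mean(feat)) / stdDev(feat)
    static double zScore(double value, double mean, double stdDev) {
        return (value - mean) / stdDev;
    }

    public static void main(String[] args) {
        // Hypothetical training statistics for one feature:
        // mean 10.0, standard deviation 2.0.
        System.out.println(zScore(13.0, 10.0, 2.0)); // prints 1.5
        System.out.println(zScore(10.0, 10.0, 2.0)); // prints 0.0
        System.out.println(zScore(7.0, 10.0, 2.0));  // prints -1.5
    }
}
```

A value equal to the training mean maps to 0, and each standard deviation above or below the mean adds or subtracts 1.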
Z-score normalization ensures that the collection of each feature's values has zero mean and unit standard deviation over the training section of the training corpus. This does not guarantee zero means and unit standard deviation over the test section of the corpus.
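The difference between training and test data can be checked numerically. The sketch below is plain Java with hypothetical data and a hypothetical class name (ZScoreNormalizationCheck). It assumes the population standard deviation (dividing by n), which may or may not match what this class computes internally.

```java
// Sketch: z-scores have zero mean and unit deviation on the data used to
// estimate the statistics, but not necessarily on other data.
final class ZScoreNormalizationCheck {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Population standard deviation (divides by n); an assumption of this sketch.
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) sumSq += (x - m) * (x - m);
        return Math.sqrt(sumSq / xs.length);
    }

    // Convert values to z-scores using a fixed mean and deviation.
    static double[] normalize(double[] xs, double m, double sd) {
        double[] zs = new double[xs.length];
        for (int i = 0; i < xs.length; i++) zs[i] = (xs[i] - m) / sd;
        return zs;
    }

    public static void main(String[] args) {
        double[] train = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0}; // hypothetical
        double[] test = {7.0, 9.0, 11.0};                          // hypothetical
        double m = mean(train);    // 5.0
        double sd = stdDev(train); // 2.0
        double[] zTrain = normalize(train, m, sd);
        double[] zTest = normalize(test, m, sd);
        // Training section: mean 0.0, deviation 1.0.
        System.out.println(mean(zTrain) + " " + stdDev(zTrain));
        // Test section: mean 2.0, deviation about 0.816.
        System.out.println(mean(zTest) + " " + stdDev(zTest));
    }
}
```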
If a feature is unseen or has zero standard deviation in the training corpus, it is removed from all output. A feature only has zero standard deviation if it has the same value every time it occurs. For instance, all features seen only once will have zero variance. Effectively, features which always have the same value in the training set will be eliminated from future consideration.
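The filtering behavior described above can be sketched in plain Java as follows. This is not this class's source code: the class ZScoreTrainingSketch, its Stats helper, and the sample feature names are hypothetical. The sketch assumes that statistics are computed only over the occurrences of a feature, and that the population standard deviation is used, which is consistent with the statement that a feature seen once has zero variance.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: estimate per-feature statistics, drop constant features,
// and drop unseen features at run time.
final class ZScoreTrainingSketch {

    /** Running count, mean, and sum of squared deviations for one feature. */
    private static final class Stats {
        long count;
        double mean;
        double m2;

        // Welford's online update; constant inputs yield exactly zero deviation.
        void add(double x) {
            count++;
            double delta = x - mean;
            mean += delta / count;
            m2 += delta * (x - mean);
        }

        // Population standard deviation (an assumption of this sketch).
        double stdDev() {
            return Math.sqrt(m2 / count);
        }
    }

    private final Map<String, Stats> stats = new HashMap<>();

    ZScoreTrainingSketch(List<Map<String, Double>> trainingVectors) {
        for (Map<String, Double> vector : trainingVectors) {
            for (Map.Entry<String, Double> e : vector.entrySet()) {
                stats.computeIfAbsent(e.getKey(), k -> new Stats()).add(e.getValue());
            }
        }
        // Features whose value never varied carry no information; remove them.
        stats.values().removeIf(s -> s.stdDev() == 0.0);
    }

    Map<String, Double> features(Map<String, Double> raw) {
        Map<String, Double> out = new HashMap<>();
        for (Map.Entry<String, Double> e : raw.entrySet()) {
            Stats s = stats.get(e.getKey());
            if (s != null) { // unseen and zero-deviation features are skipped
                out.put(e.getKey(), (e.getValue() - s.mean) / s.stdDev());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> training = List.of(
            Map.of("len", 2.0, "bias", 1.0),
            Map.of("len", 4.0, "bias", 1.0, "rare", 7.0));
        ZScoreTrainingSketch sketch = new ZScoreTrainingSketch(training);
        // "bias" is constant, "rare" occurs once, "new" is unseen: all dropped.
        System.out.println(
            sketch.features(Map.of("len", 5.0, "bias", 1.0, "rare", 7.0, "new", 3.0)));
        // prints {len=2.0}
    }
}
```

Here "len" has training mean 3.0 and deviation 1.0, so the value 5.0 becomes 2.0, and it is the only feature that survives.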
A z-score feature extractor is serializable if its base feature extractor is serializable.
| Constructor Summary | |
|---|---|
| ZScoreFeatureExtractor(Corpus<ObjectHandler<Classified<E>>> corpus, FeatureExtractor<? super E> extractor) | Construct a z-score feature extractor from the specified base feature extractor and the training section of the supplied corpus. |

| Method Summary | |
|---|---|
| Map<String,? extends Number> | features(E in): Return the feature map resulting from converting the feature map produced by the underlying feature extractor to z-scores. |
| Set<String> | knownFeatures(): Returns an unmodifiable view of the known features for this z-score feature extractor. |
| double | mean(String feature): Returns the mean for the specified feature, or Double.NaN if the feature is not known. |
| double | standardDeviation(String feature): Returns the standard deviation for the specified feature, or Double.NaN if the feature is not known. |
| String | toString(): Returns a string representation of this z-score feature extractor, listing the mean and deviation for each feature. |
| double | zScore(String feature, double value): Return the z-score for the specified feature and value. |

| Methods inherited from class com.aliasi.features.FeatureExtractorFilter |
|---|
baseExtractor |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public ZScoreFeatureExtractor(Corpus<ObjectHandler<Classified<E>>> corpus,
                              FeatureExtractor<? super E> extractor)
                       throws IOException
Parameters:
extractor - Base feature extractor.
corpus - The corpus whose training section will be visited.
Throws:
IOException - If there is an I/O error visiting the corpus.

| Method Detail |
|---|
public Map<String,? extends Number> features(E in)
Specified by: features in interface FeatureExtractor<E>
Overrides: features in class FeatureExtractorFilter<E>
Parameters: in - Input object.
public double zScore(String feature,
                     double value)
Parameters:
feature - Feature name.
value - Value of feature.
public double mean(String feature)
Returns: the mean for the specified feature, or Double.NaN if the feature is not known.
Parameters: feature - Feature whose mean is returned.
public double standardDeviation(String feature)
Returns: the standard deviation for the specified feature, or Double.NaN if the feature is not known.
Parameters: feature - Feature whose standard deviation is returned.
public Set<String> knownFeatures()
public String toString()
Overrides: toString in class Object