java.lang.Object
  com.aliasi.classify.LogisticRegressionClassifier<E>
public class LogisticRegressionClassifier<E>
A LogisticRegressionClassifier provides conditional
probability classifications of input objects using an underlying
logistic regression model and feature extractor. Logistic regression
is a discriminative classifier which operates over arbitrary
floating-point-valued features of objects.
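For orientation, the conditional probabilities of a multinomial logistic model can be computed as a softmax over per-category linear scores. The sketch below is a generic illustration with hypothetical names (SoftmaxSketch, conditionalProbs); it assumes one weight vector per category, and the parameterization actually used by LogisticRegression may differ.

```java
/** Hypothetical illustration only; not the LingPipe implementation. */
final class SoftmaxSketch {

    /**
     * Computes p(c | x) = exp(beta_c . x) / sum over c' of exp(beta_c' . x).
     * Assumes (for illustration) one dense weight vector per category.
     */
    static double[] conditionalProbs(double[][] betas, double[] x) {
        double[] probs = new double[betas.length];
        double max = Double.NEGATIVE_INFINITY;
        // Linear score for each category: dot product of weights and input.
        for (int c = 0; c < betas.length; ++c) {
            for (int i = 0; i < x.length; ++i)
                probs[c] += betas[c][i] * x[i];
            max = Math.max(max, probs[c]);
        }
        // Subtract the maximum score before exponentiating to avoid overflow.
        double sum = 0.0;
        for (int c = 0; c < probs.length; ++c) {
            probs[c] = Math.exp(probs[c] - max);
            sum += probs[c];
        }
        // Normalize so the values sum to 1.
        for (int c = 0; c < probs.length; ++c)
            probs[c] /= sum;
        return probs;
    }
}
```

With all-zero weights every category receives the same probability; larger scores yield larger probabilities.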
Logistic regression classifiers may be trained from a data
corpus using the method train(FeatureExtractor,Corpus,int,boolean,RegressionPrior,AnnealingSchedule,double,int,int,PrintWriter),
the last six arguments of which are shared with the logistic
regression training method LogisticRegression.estimate(Vector[],int[],RegressionPrior,AnnealingSchedule,double,int,int,PrintWriter).
The first four arguments are required to adapt logistic regression
to general classification, and consist of a feature extractor, a
corpus to train over, a minimum feature count, and a boolean flag
indicating whether or not to add an intercept feature to every input vector.
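The int argument in the train signature is the minimum feature count, which controls which features are kept in the model. A minimal sketch of that kind of pruning follows. It assumes a feature's count is the number of training feature maps that contain it, which may not match how the library counts; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration only; not the LingPipe implementation. */
final class FeaturePruningSketch {

    /**
     * Returns the features that occur in at least minFeatureCount of the
     * given feature maps. Assumption: a feature counts once per map.
     */
    static Set<String> keptFeatures(List<Map<String, Double>> featureMaps,
                                    int minFeatureCount) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, Double> featureMap : featureMaps)
            for (String feature : featureMap.keySet())
                counts.merge(feature, 1, Integer::sum);

        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet())
            if (entry.getValue() >= minFeatureCount)
                kept.add(entry.getKey());
        return kept;
    }
}
```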
This class merely acts as an adapter to implement the Classifier interface based on the LogisticRegression class
in the statistics package. The basis of the adaptation is a
general feature extractor, which is an instance of FeatureExtractor. A feature extractor converts an arbitrary input
object (whose type is specified generically in this class) to a
mapping from features (represented as strings) to values
(represented as instances of Number). The class then uses
a symbol table for features to convert the maps from feature names
to numbers into sparse vectors, where the dimensions are the
identifiers for the features in the symbol table. By convention,
if the intercept feature flag is set, the classifier sets dimension 0
of every input vector to 1.0.
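The conversion described above can be sketched in plain Java. This is an illustration under stated assumptions, not the library's code: the symbol table is modeled as a Map from feature name to dimension, the intercept feature name is made up, and features missing from the table are simply skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not the LingPipe implementation. */
final class FeatureVectorSketch {

    // Made-up name for the intercept feature.
    static final String INTERCEPT = "*INTERCEPT*";

    /** Creates a toy symbol table, reserving dimension 0 if an intercept is used. */
    static Map<String, Integer> newSymbolTable(boolean addInterceptFeature) {
        Map<String, Integer> table = new LinkedHashMap<>();
        if (addInterceptFeature)
            table.put(INTERCEPT, 0);
        return table;
    }

    /** Returns the identifier for the feature, assigning the next free one if new. */
    static int getOrAddSymbol(Map<String, Integer> table, String feature) {
        Integer id = table.get(feature);
        if (id == null) {
            id = table.size();
            table.put(feature, id);
        }
        return id;
    }

    /** Converts a feature map to a sparse vector keyed by dimension. */
    static Map<Integer, Double> toSparseVector(Map<String, ? extends Number> features,
                                               Map<String, Integer> table,
                                               boolean addInterceptFeature) {
        Map<Integer, Double> vector = new TreeMap<>();
        for (Map.Entry<String, ? extends Number> e : features.entrySet()) {
            Integer dimension = table.get(e.getKey());
            if (dimension != null)  // assumption: unknown features are ignored
                vector.put(dimension, e.getValue().doubleValue());
        }
        if (addInterceptFeature)
            vector.put(0, 1.0);     // dimension 0 always carries the value 1.0
        return vector;
    }
}
```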
For more information on the logistic regression model itself and
the training procedure used, see the class documentation for LogisticRegression.
This class implements both Serializable and Compilable. Both do the
same thing: they write the content of the model to the object output.
The model read back in will be an instance of LogisticRegressionClassifier
with the same components as the model that was serialized or compiled.
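The round trip described above can be illustrated with standard Java serialization alone. The sketch uses a hypothetical stand-in class (TinyModel) rather than the classifier itself, and shows that the object read back carries the same components while being a distinct instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical illustration only; TinyModel stands in for a real model. */
final class RoundTripSketch {

    static final class TinyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String[] categories;
        final double[][] weights;

        TinyModel(String[] categories, double[][] weights) {
            this.categories = categories;
            this.weights = weights;
        }
    }

    /** Writes the model to an in-memory object output and reads it back. */
    static TinyModel roundTrip(TinyModel model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TinyModel) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Wrapped so callers of this sketch need not handle checked exceptions.
            throw new IllegalStateException(e);
        }
    }
}
```

Because TinyModel does not override equals(), the copy is not equal to the original under Object.equals(), even though its arrays hold the same values.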
| Method Summary | | |
|---|---|---|
| List<String> | categorySymbols() | Returns the category symbols used by this classifier. |
| ConditionalClassification | classify(E in) | Returns the conditional classification of the specified object using logistic regression classification. |
| void | compileTo(ObjectOutput objOut) | Compiles this classifier to the specified object output. |
| SymbolTable | featureSymbolTable() | Returns an unmodifiable view of the symbol table used for features in this classifier. |
| ObjectToDoubleMap<String> | featureValues(String category) | Returns a mapping from features to their parameter values for the specified category. |
| String | toString() | Returns a string-based representation of this classifier, listing the parameter vectors for each category. |
| static <F> LogisticRegressionClassifier<F> | train(FeatureExtractor<? super F> featureExtractor, Corpus<ClassificationHandler<F,Classification>> corpus, int minFeatureCount, boolean addInterceptFeature, RegressionPrior prior, AnnealingSchedule annealingSchedule, double minImprovement, int minEpochs, int maxEpochs, PrintWriter progressWriter) | Returns a trained logistic regression classifier given the specified feature extractor, corpus, model priors and search parameters. |

| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Method Detail |
|---|
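public SymbolTable featureSymbolTable()
Returns an unmodifiable view of the symbol table used for features in this classifier.

public List<String> categorySymbols()
Returns the category symbols used by this classifier.

public ConditionalClassification classify(E in)
Returns the conditional classification of the specified object using logistic regression classification.
Specified by:
classify in interface Classifier<E,ConditionalClassification>
Parameters:
in - Input object to classify.

public void compileTo(ObjectOutput objOut)
throws IOException
Compiles this classifier to the specified object output. The classifier read back in has the same components as this one, though the two are not necessarily equal in the Object.equals() sense.
Serializing this class produces exactly the same output.
Specified by:
compileTo in interface Compilable
Parameters:
objOut - Object output to which this classifier is written.
Throws:
IOException - If there is an underlying I/O error writing the model to the stream.

public ObjectToDoubleMap<String> featureValues(String category)
Returns a mapping from features to their parameter values for the specified category.
Parameters:
category - Classification category.
Throws:
IllegalArgumentException - If the category is unknown.

public String toString()
Returns a string-based representation of this classifier, listing the parameter vectors for each category.
Overrides:
toString in class Object
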
public static <F> LogisticRegressionClassifier<F> train(FeatureExtractor<? super F> featureExtractor,
Corpus<ClassificationHandler<F,Classification>> corpus,
int minFeatureCount,
boolean addInterceptFeature,
RegressionPrior prior,
AnnealingSchedule annealingSchedule,
double minImprovement,
int minEpochs,
int maxEpochs,
PrintWriter progressWriter)
throws IOException
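Returns a trained logistic regression classifier given the specified
feature extractor, corpus, model priors and search parameters.
Only the training section of the specified corpus is used for training.
See the class documentation above and the class
documentation for LogisticRegression for more
information on the parameters.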
Parameters:
featureExtractor - Converter from objects to feature maps.
corpus - Corpus of training data.
minFeatureCount - Minimum count for a feature in the corpus for it to be kept as part of the model.
addInterceptFeature - A flag set to true if an intercept feature should be added to each input vector.
prior - The prior for regularization of the regression.
annealingSchedule - Class to compute learning rate for each epoch.
minImprovement - Minimum relative improvement in error during an epoch; the search stops when the improvement falls below this value.
minEpochs - Minimum number of search epochs.
maxEpochs - Maximum number of epochs.
progressWriter - Writer to which progress reports are written.
and checks for termination.
Throws:
IOException - If there is an underlying I/O exception reading the data from the corpus.
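How minImprovement, minEpochs and maxEpochs might interact can be sketched as a simple stopping rule. This is an assumption-laden illustration, not the library's search loop: the relative improvement formula, the class name and the method name are all hypothetical, and the per-epoch errors are supplied as an array rather than computed by training.

```java
/** Hypothetical illustration only; not the LingPipe search loop. */
final class StoppingSketch {

    /**
     * Returns the number of epochs that would run given per-epoch errors.
     * Assumed rule: stop once at least minEpochs have run and the relative
     * improvement (lastError - error) / |lastError| drops below minImprovement;
     * never run more than maxEpochs.
     */
    static int epochsRun(double[] errorPerEpoch, double minImprovement,
                         int minEpochs, int maxEpochs) {
        int limit = Math.min(maxEpochs, errorPerEpoch.length);
        double lastError = Double.NaN;
        for (int epoch = 0; epoch < limit; ++epoch) {
            double error = errorPerEpoch[epoch];
            // The first epoch has no previous error to compare against.
            if (epoch > 0 && epoch + 1 >= minEpochs) {
                if (lastError == 0.0)
                    return epoch + 1;   // nothing left to improve
                double relativeImprovement = (lastError - error) / Math.abs(lastError);
                if (relativeImprovement < minImprovement)
                    return epoch + 1;   // converged under the assumed rule
            }
            lastError = error;
        }
        return limit;
    }
}
```

Under this rule, raising minEpochs delays stopping even when the error has flattened, and maxEpochs caps the search regardless of improvement.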