java.lang.Object
  com.aliasi.classify.BernoulliClassifier<E>

E - the type of object classified

public class BernoulliClassifier<E>
A BernoulliClassifier provides a feature-based
classifier where feature values are reduced to booleans based on a
specified threshold. Training events are supplied in the usual
way through the handle(Object,Classification) method.
Given a feature threshold of t, any feature with
value strictly greater than the threshold t for a
given input is activated, and all other features are not activated
for that input.
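The activation rule can be sketched in a few lines of Java. This is an illustrative stand-alone example, not the LingPipe implementation; the class name FeatureActivationSketch, the method activeFeatures, and the feature names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper illustrating threshold-based feature activation.
class FeatureActivationSketch {

    // Returns the names of the features whose value is strictly greater
    // than the threshold; all other features count as not activated.
    static Set<String> activeFeatures(Map<String, ? extends Number> featureValues,
                                      double threshold) {
        Set<String> active = new TreeSet<>();
        for (Map.Entry<String, ? extends Number> entry : featureValues.entrySet()) {
            if (entry.getValue().doubleValue() > threshold) {
                active.add(entry.getKey());
            }
        }
        return active;
    }

    public static void main(String[] args) {
        Map<String, Double> values = new LinkedHashMap<>();
        values.put("contains_free", 2.0);
        values.put("contains_meeting", 0.0); // equal to the threshold, so not activated
        values.put("sentiment", -1.5);
        System.out.println(activeFeatures(values, 0.0));
    }
}
```

With a threshold of 0.0, only contains_free is activated here, because the comparison is strict.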
The likelihood of a feature in a category is estimated with the
training sample counts using add-one smoothing (also known as
Laplace smoothing, or a uniform Dirichlet prior). There is also
a term for the category distribution. Suppose F is
the complete set of features seen during training. Further suppose
that count(cat) is the number of training samples
for category cat, and count(cat,feat)
is the number of training instances of the specified category that
had the specified feature activated. The contribution of
each feature is then computed by:
    p(+feat|cat) = (count(cat,feat) + 1) / (count(cat) + 2)
    p(-feat|cat) = 1.0 - p(+feat|cat)
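As a worked illustration of the add-one estimate, the following sketch computes both feature probabilities from raw counts. The class and method names are hypothetical and the counts are made up for the example; this is not the library's code.

```java
// Hypothetical helper illustrating the add-one smoothed feature estimates.
class SmoothedFeatureLikelihood {

    // p(+feat|cat) = (count(cat,feat) + 1) / (count(cat) + 2)
    static double pActive(long countCatFeat, long countCat) {
        return (countCatFeat + 1.0) / (countCat + 2.0);
    }

    // p(-feat|cat) is the complement of the activated estimate.
    static double pInactive(long countCatFeat, long countCat) {
        return 1.0 - pActive(countCatFeat, countCat);
    }

    public static void main(String[] args) {
        // Assumed counts: 8 training samples in the category, 3 with the feature active.
        System.out.println(pActive(3, 8));   // (3 + 1) / (8 + 2)
        System.out.println(pInactive(3, 8)); // complement of the line above
        // With no training data the estimate falls back to the uniform prior.
        System.out.println(pActive(0, 0));   // (0 + 1) / (0 + 2)
    }
}
```

Because of the smoothing, neither estimate is ever exactly 0 or 1, so the logarithms of these values are always finite.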
Assuming the total number of training instances is totalCount,
we use a simple maximum-likelihood estimate for the category probability:
    p(cat) = count(cat) / totalCount
With these two definitions, the joint probability estimate for a category
cat given activated features
{f[0],...,f[n-1]} and unactivated features
{g[0],...,g[m-1]} is:
    p(cat,{f[0],...,f[n-1]})
        = p(cat)
        * Π_{i < n} p(+f[i]|cat)
        * Π_{j < m} p(-g[j]|cat)
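The product is usually evaluated as a sum of logarithms to avoid underflow. The sketch below does this in base 2 for a single category, looping over every feature seen in training. All names (BernoulliJointLog2, log2Joint, the parameter names) are hypothetical, and the count structures are an assumption about how such statistics might be stored, not a description of the class's internals.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper computing the log (base 2) joint estimate for one category.
class BernoulliJointLog2 {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // countCat:       number of training samples in the category
    // totalCount:     total number of training samples
    // featCountsCat:  count(cat,feat) for this category (missing entries mean 0)
    // allFeatures:    the complete feature set F seen during training
    // activeFeatures: the features activated for the input being scored
    static double log2Joint(long countCat, long totalCount,
                            Map<String, Long> featCountsCat,
                            Set<String> allFeatures,
                            Set<String> activeFeatures) {
        double sum = log2((double) countCat / totalCount); // log2 p(cat)
        for (String feat : allFeatures) {
            long count = featCountsCat.getOrDefault(feat, 0L);
            double pActive = (count + 1.0) / (countCat + 2.0);
            // Activated features contribute p(+feat|cat), the rest p(-feat|cat).
            sum += log2(activeFeatures.contains(feat) ? pActive : 1.0 - pActive);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("a", 1L);
        counts.put("b", 2L);
        Set<String> all = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> active = Collections.singleton("a");
        // p(cat) = 2/4, p(+a|cat) = 2/4, p(-b|cat) = 1/4, so the joint is 1/16.
        System.out.println(log2Joint(2, 4, counts, all, active));
    }
}
```

Note that unactivated features contribute to the score as well, which is why the loop runs over the full feature set rather than only the features present in the input.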
The JointClassification class requires log (base 2) estimates,
and is responsible for converting these to conditional estimates.
The scores in this case are just the log2 joint estimates.
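For readers unfamiliar with the conversion, conditional estimates follow from the joint estimates by normalizing over categories: p(cat|input) = p(cat,input) / Σ p(cat',input). The sketch below shows one numerically stable way to do this from log2 values. It is an illustration with hypothetical names and does not reproduce the JointClassification implementation.

```java
// Hypothetical helper converting log2 joint estimates to conditional probabilities.
class Log2JointToConditional {

    // Input: one log2 joint estimate per category.
    // Output: p(cat|input) for each category, in the same order.
    static double[] conditionalProbs(double[] log2Joints) {
        // Subtract the maximum before exponentiating so that very negative
        // log values do not all underflow to zero.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : log2Joints) {
            max = Math.max(max, v);
        }
        double[] probs = new double[log2Joints.length];
        double sum = 0.0;
        for (int i = 0; i < log2Joints.length; i++) {
            probs[i] = Math.pow(2.0, log2Joints[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Joint estimates 1/16 and 1/32 normalize to 2/3 and 1/3.
        double[] probs = conditionalProbs(new double[] {-4.0, -5.0});
        System.out.println(probs[0] + " " + probs[1]);
    }
}
```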
The dynamic form of the estimator may be used for classification, but it is not very efficient. It loops over every feature for every category.
The serialized version of a Bernoulli classifier will
deserialize as an equivalent instance of
BernoulliClassifier. In order to serialize a
Bernoulli classifier, the feature extractor must be serializable.
Otherwise an exception will be raised during serialization.
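The general Java behavior behind this requirement can be demonstrated with stand-in classes. In the sketch below, Model, Extractor, SerializableExtractor and PlainExtractor are hypothetical placeholders, not LingPipe types, and the example makes no claim about how BernoulliClassifier serializes itself internally. It only shows that standard Java serialization fails when an object holds a field that is not serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-ins showing why a held extractor must be serializable.
class SerializationRequirementSketch {

    // Placeholder for a feature extractor interface.
    interface Extractor { }

    static class SerializableExtractor implements Extractor, Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Does not implement Serializable.
    static class PlainExtractor implements Extractor { }

    // Placeholder for a classifier that stores its extractor in a field.
    static class Model implements Serializable {
        private static final long serialVersionUID = 1L;
        final Extractor extractor;

        Model(Extractor extractor) {
            this.extractor = extractor;
        }
    }

    // Returns true if the object can be written with standard Java serialization,
    // false if a NotSerializableException is raised along the way.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Model(new SerializableExtractor())));
        System.out.println(canSerialize(new Model(new PlainExtractor())));
    }
}
```

In practice this means a feature extractor written as an anonymous or ordinary class should implement java.io.Serializable if the trained classifier is to be saved.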
Compilation is not yet implemented.
| Constructor Summary | |
|---|---|
| BernoulliClassifier(FeatureExtractor<E> featureExtractor) | Construct a Bernoulli classifier with the specified feature extractor and the default feature activation threshold of 0.0. |
| BernoulliClassifier(FeatureExtractor<E> featureExtractor, double featureActivationThreshold) | Construct a Bernoulli classifier with the specified feature extractor and specified feature activation threshold. |
| Method Summary | |
|---|---|
| String[] | categories() - Returns the categories for this classifier. |
| JointClassification | classify(E input) - Classify the specified input using this Bernoulli classifier. |
| void | handle(E input, Classification classification) - Handle the specified training event, consisting of an input and its first-best classification. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public BernoulliClassifier(FeatureExtractor<E> featureExtractor)
featureExtractor - Feature extractor for classification.
public BernoulliClassifier(FeatureExtractor<E> featureExtractor,
double featureActivationThreshold)
featureExtractor - Feature extractor for classification.
featureActivationThreshold - The threshold for feature activation (see the class documentation).
| Method Detail |
|---|
public String[] categories()
public void handle(E input,
Classification classification)
handle in interface ClassificationHandler<E,Classification>
input - Object whose classification result is being trained.
classification - Classification result for object.
public JointClassification classify(E input)
classify in interface Classifier<E,JointClassification>
input - Input to classify.