com.aliasi.classify
Class PerceptronClassifier<E>

java.lang.Object
  extended by com.aliasi.classify.PerceptronClassifier<E>
Type Parameters:
E - the type of object being classified
All Implemented Interfaces:
BaseClassifier<E>, Classifier<E,ScoredClassification>, RankedClassifier<E>, ScoredClassifier<E>, Serializable

public class PerceptronClassifier<E>
extends Object
implements Classifier<E,ScoredClassification>, ScoredClassifier<E>, Serializable

A PerceptronClassifier implements a binary classifier based on an averaged kernel-based perceptron. These classifiers are large margin (discriminitive) linear classifiers in a feature space expanded by a plug-and-play kernel implemeting KernelFunction.

A perceptron classifier may be applied to any type of object. An appropriately typed FeatureExtractor is used to map these objects to feature vectors for use in the perceptron classifier.

Corpus Training

Unlike the language-model-based classifiers, for which training, classification and compilation may be interleaved, averaged perceptron-based classifiers require batch training. This requires the entire training corpus to be available in one shot. In particular, training will iterate over a fixed corpus multiple times before producing a completely trained classifier.

The constructor will do the training using the supplied instance of Corpus. The constructor will store the entire corpus in memory in the form of SparseFloatVector and boolean polarities for the accept/reject decision. The corpus will only be held locally in the constructor; it is available for garbage collection, as are all intermediate results, as soon as the constructor is done training.

Kernel Function

The basic (non-kernel) perceptron is equivalent to using the kernel function DotProductKernel. A good choice for most classification tasks is the polynomial kernel, implemented in PolynomialKernel. Usually, higher polynomial kernel degrees perform dramatically better than dot products. 3 is a good general starting degree, as sometimes performance degrades in higher kernel degrees. In some cases, the Gaussian radial basis kernel implemented in GaussianRadialBasisKernel works well.

If the kernel function is neither serializable nor compilable, then the resulting perceptron classifier will not be serializable.

Training Iterations

More training iterations are usually better for accuracy. As more basis vectors are added to the perceptron, more memory is needed for the model and more time is needed for classification. Typically, the amount of memory required at run time will stabilize after a few training iterations.

Memory Usage: Training and Compiled

The memory usage of a perceptron classifier may be very high. The perceptron must store every feature vector in the input on which the classifier being trained made a mistake in some iteration. During training, every feature vector in the input corpus must be stored. These are stored in memory as instances of SparseFloatVector.

If the data are linearly separable in the kernel space, the training process will converge to the point where no additional basis feature vectors are needed and the result will converge to using the single final perceptron (which requires all the intermediate to be stored given the demands of a non-linear kernel calculation).

Serialization

After a perceptron classifier is constructed, it may be serialized to an object output stream. If the underlying feature extractor is compilable, it's compiled, but if it's not compilable, it's serialized. To be serializable, a perceptron classifier requires both its feature extractor and kernel function to be Serializable.

The object read back in after serialization will be an instance of PerceptronClassifier.

About Averaged Kernel Perceptrons

Perceptrons are a kind of large-margin linear classifier. The polynomial kernel trick is used to embed the basic feature vector in a higher-degree vector space in which the data are more separable.

An average of all of the perceptrons created during training is used for the final prediction. The factored formulation of the algorithm allows the perceptrons to be expressed as linearly weighted training samples.

Although theoretical bounds are almost equivalent, in practice averaged perceptrons slightly underperform support vector machine (SVM) learners over the same polynomial kernels. The advantage of perceptrons is that they are much more efficient in time and slightly more efficient in space in practice.

Averaged Perceptron Model with Polynomial Kernel

The model used for runtime predictions by the averaged perceptron is quite straightforward, consisting of a set of weighted feature vectors (represented as parallel arrays of basis vectors and weights) and a kernel degree:

 Vector[] basisVectors;
 int[] weights;
 int degree;
The basis vectors are all vectors derived from single training examples by the specified feature extractor. The weights may be positive or negative and represent the cumulative voted weight of the specified basis vector.

The kernel function computes a distance kernel(v1,v2) between vectors v1 and v2 in an enhanced feature space defined by the particular kernel employed.

A new input to classify is first converted to a feature vector by the feature extractor. Classification is then based on the sign of the following score:

 score(Vector v) = Σi weights[i] * kernel(basisVectors[i],v)
An example is accepted if the score of its feature vector is greater than zero and rejected otherwise.

Estimating the Perceptron Model

To estimate the perceptron model, we will assume that we have a training corpus consisting of an array of vectors with boolean polarities indicating whether they are positive (to accept) or negative (to reject) examples. We also assume we have a fixed kernel function. The training method iterates over the corpus a specified number of times.
 Vector[] basisVectors;
 int[] incrementalWeights;
 boolean[] polarities;
 int degree;
 int index = -1;
 for (# iterations)
     for (vector,polarity) in training corpus
         yHat = scoreIntermediate(vector);
         if (yHat > 0 && polarity) || (yHat < 0 && !polarity)
             ++incrementalWeights[index];
         else
             ++index;
             basisVectors[index] = vector;
             polarities[index] = polarity;
             incrementalWeights[index] = 1;
 scoreIntermediate(vector)
   = Σi <= index polarities[i] * kernel(basisVectors[i],vector)
 
The final weight for a vector is the cumulative weight computed as follows:
 cumulativeWeight(j) = Σk >= j incrementalWeights[k]
 
The actual implementations of these methods involve considerably more indirection and index chasing to avoid copies and duplication in the final vectors.

Historical Notes

The averaged kernel perceptron implemented here was introduced in the following paper, which also provides error bounds for learning and evaluations with polynomial kernels of various degrees:

Freund, Yoav and Robert E. Schapire (1999) Large margin classification using the perceptron algorithm. Machine Learning 37(3):277-296.
The basic perceptron model was introduced in:
Block, H.D. (1962) The perceptron: a model for brain functioning. Reviews of Modern Physics 34:123-135.

The kernel-based perceptron was introduced in:

Aizerman, M.A., E.M. Braverman, and L.I. Rozonoer. 1964. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control. 25:821-837.
The basis of the voting scheme is a deterministically averaged version of the randomized approach of adapting online learners to a batch setup described in the following paper:
Helmbold, D.P. and M.K. Warmuth. (1995) On weak learning. Journal of Computer and System Sciences 50:551-573.

Since:
LingPipe3.1
Version:
3.9.1
Author:
Bob Carpenter
See Also:
Serialized Form

Constructor Summary
PerceptronClassifier(Corpus<ObjectHandler<Classified<E>>> corpus, FeatureExtractor<? super E> featureExtractor, KernelFunction kernelFunction, String corpusAcceptCategory, int numIterations, String outputAcceptCategory, String outputRejectCategory)
          Construct a perceptron classifier from the specified feature extractor, corpus with designated accept category, polynomial kernel degree and number of training iterations, and output accept and reject categories.
PerceptronClassifier(FeatureExtractor<? super E> featureExtractor, KernelFunction kernelFunction, Corpus<ClassificationHandler<E,Classification>> corpus, String corpusAcceptCategory, int numIterations)
          Deprecated. Use constructor PerceptronClassifier(Corpus,FeatureExtractor,KernelFunction,String,int,String,String) instead.
PerceptronClassifier(FeatureExtractor<? super E> featureExtractor, KernelFunction kernelFunction, Corpus<ClassificationHandler<E,Classification>> corpus, String corpusAcceptCategory, int numIterations, String outputAcceptCategory, String outputRejectCategory)
          Deprecated. Use constructor PerceptronClassifier(Corpus,FeatureExtractor,KernelFunction,String,int,String,String) instead.
 
Method Summary
 ScoredClassification classify(E in)
          Return the scored classification for the specified input.
 FeatureExtractor<? super E> featureExtractor()
          Returns the feature extractor for this perceptron.
 KernelFunction kernelFunction()
          Returns the kernel function for this perceptron.
 String toString()
          Returns a string-based representation of this perceptron.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PerceptronClassifier

@Deprecated
public PerceptronClassifier(FeatureExtractor<? super E> featureExtractor,
                                       KernelFunction kernelFunction,
                                       Corpus<ClassificationHandler<E,Classification>> corpus,
                                       String corpusAcceptCategory,
                                       int numIterations)
                     throws IOException
Deprecated. Use constructor PerceptronClassifier(Corpus,FeatureExtractor,KernelFunction,String,int,String,String) instead.

Construct a perceptron classifier from the specified feature extractor, corpus with designated accept category, polynomial kernel degree and number of training iterations, using the default runtime accept and reject categories.

The default runtime accept and reject categories are BinaryLMClassifier.DEFAULT_ACCEPT_CATEGORY and BinaryLMClassifier.DEFAULT_REJECT_CATEGORY, respectively.

Parameters:
featureExtractor - Feature extractor for objects.
corpus - Corpus to use for training.
corpusAcceptCategory - Category in training data to treat as positive.
kernelFunction - Kernel function for expanding vector basis.
numIterations - Number of iterations to carry out during training.
Throws:
IOException

PerceptronClassifier

@Deprecated
public PerceptronClassifier(FeatureExtractor<? super E> featureExtractor,
                                       KernelFunction kernelFunction,
                                       Corpus<ClassificationHandler<E,Classification>> corpus,
                                       String corpusAcceptCategory,
                                       int numIterations,
                                       String outputAcceptCategory,
                                       String outputRejectCategory)
                     throws IOException
Deprecated. Use constructor PerceptronClassifier(Corpus,FeatureExtractor,KernelFunction,String,int,String,String) instead.

Construct a perceptron classifier from the specified feature extractor, corpus with designated accept category, polynomial kernel degree and number of training iterations, and output accept and reject categories.

Parameters:
featureExtractor - Feature extractor for objects.
corpus - Corpus to use for training.
corpusAcceptCategory - Category in training data to treat as positive.
kernelFunction - Kernel function for expanding vector basis.
numIterations - Number of iterations to carry out during training.
outputAcceptCategory - Category with which to label accepted instances.
outputRejectCategory - Category with which to label rejected instances.
Throws:
IOException

PerceptronClassifier

public PerceptronClassifier(Corpus<ObjectHandler<Classified<E>>> corpus,
                            FeatureExtractor<? super E> featureExtractor,
                            KernelFunction kernelFunction,
                            String corpusAcceptCategory,
                            int numIterations,
                            String outputAcceptCategory,
                            String outputRejectCategory)
                     throws IOException
Construct a perceptron classifier from the specified feature extractor, corpus with designated accept category, polynomial kernel degree and number of training iterations, and output accept and reject categories.

Parameters:
corpus - Corpus to use for training.
featureExtractor - Feature extractor for objects.
corpusAcceptCategory - Category in training data to treat as positive.
kernelFunction - Kernel function for expanding vector basis.
numIterations - Number of iterations to carry out during training.
outputAcceptCategory - Category with which to label accepted instances.
outputRejectCategory - Category with which to label rejected instances.
Throws:
IOException
Method Detail

kernelFunction

public KernelFunction kernelFunction()
Returns the kernel function for this perceptron.

Returns:
The kernel function for this perceptron.

featureExtractor

public FeatureExtractor<? super E> featureExtractor()
Returns the feature extractor for this perceptron.

Returns:
The feature extractor for this perceptron.

toString

public String toString()
Returns a string-based representation of this perceptron. This may be long, as it outputs every basis vector and weight.

Overrides:
toString in class Object
Returns:
A string-based representation of this perceptron.

classify

public ScoredClassification classify(E in)
Return the scored classification for the specified input. The input is first converted to a feature vector using the feature extractor, then scored against the perceptron. The resulting score for the accept category is the perceptron score, and the resulting score for the reject category is the negative perceptron score.

Specified by:
classify in interface BaseClassifier<E>
Specified by:
classify in interface Classifier<E,ScoredClassification>
Specified by:
classify in interface RankedClassifier<E>
Specified by:
classify in interface ScoredClassifier<E>
Parameters:
in - The element to be classified.
Returns:
The scored classification for the specified element.