

java.lang.Object
  com.aliasi.classify.BigVectorClassifier
public class BigVectorClassifier
A BigVectorClassifier provides an efficient linear classifier implementation for large numbers of categories. Inputs are vector implementations and outputs are scored classifications pruned to the top N.
This class reverses what is typically a category-dominant (row-dominant) approach into a feature-dominant (column-dominant) representation, which allows scaling to large numbers of categories when the columns are sparse.
The standard approach in linear classifiers is to multiply a (possibly sparse) input vector by each category's vector representation. The vector representing a category maps features to values, and may be sparse.
This class reverses the representation. Rather than a map from categories to features to values, it uses a map from features to categories to values. For a sparse input, it then iterates over the categories for each feature and adds the results. If the maps from categories to values for features are very sparse, this saves significant time over multiplying the input by each category's vector representation.
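The reversed representation can be illustrated with a minimal sketch. It does not use LingPipe classes: FeatureDominantScorer, setWeight and score are hypothetical names, sparse vectors are modeled as plain maps from feature identifier to value, and scores are accumulated in a hash map rather than merged with the custom heap this class uses.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; this is not the LingPipe implementation.
final class FeatureDominantScorer {

    // Hypothetical index: feature id -> (category id -> weight).
    // Only non-zero weights are stored, so sparse features have small maps.
    private final Map<Integer, Map<Integer, Double>> featureToCategoryWeights =
        new HashMap<>();

    // Record the weight that a feature contributes to a category.
    void setWeight(int feature, int category, double weight) {
        featureToCategoryWeights
            .computeIfAbsent(feature, f -> new HashMap<>())
            .put(category, weight);
    }

    // Score a sparse input (feature id -> value). Only categories that
    // share at least one feature with the input are ever visited.
    Map<Integer, Double> score(Map<Integer, Double> input) {
        Map<Integer, Double> scores = new HashMap<>();
        for (Map.Entry<Integer, Double> in : input.entrySet()) {
            Map<Integer, Double> weights = featureToCategoryWeights.get(in.getKey());
            if (weights == null) {
                continue; // feature unknown to every category
            }
            for (Map.Entry<Integer, Double> w : weights.entrySet()) {
                // Add this feature's contribution to the category's dot product.
                scores.merge(w.getKey(), in.getValue() * w.getValue(), Double::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        FeatureDominantScorer scorer = new FeatureDominantScorer();
        scorer.setWeight(0, 0, 1.0);
        scorer.setWeight(0, 2, 2.0);
        scorer.setWeight(1, 2, 0.5);

        Map<Integer, Double> input = new HashMap<>();
        input.put(0, 3.0);
        input.put(1, 4.0);
        System.out.println(scorer.score(input));
    }
}
```

The work done by score is proportional to the number of (feature, category) weights touched by the input, not to the total number of categories, which is the source of the savings described above.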
This class uses a custom heap to efficiently merge the features for each category, and a bounded priority queue for collecting the n-best results.
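The idea of collecting n-best results with a bounded priority queue can be sketched as follows. This is an assumption-laden illustration, not the queue used by this class: TopNSelector and topN are hypothetical names, the standard java.util.PriorityQueue stands in for the bounded queue, and scores are given as a dense array indexed by category identifier.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Illustrative sketch only; this is not the LingPipe implementation.
final class TopNSelector {

    // Return the ids of the maxResults highest-scoring categories,
    // ordered from highest to lowest score.
    static int[] topN(double[] scores, int maxResults) {
        if (maxResults <= 0) {
            return new int[0];
        }
        // Min-heap on score: the head is the weakest candidate kept so far.
        PriorityQueue<Integer> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Integer c) -> scores[c]));
        for (int c = 0; c < scores.length; ++c) {
            if (heap.size() < maxResults) {
                heap.add(c);
            } else if (scores[c] > scores[heap.peek()]) {
                heap.poll(); // evict the weakest candidate
                heap.add(c);
            }
        }
        // Polling yields ascending scores, so fill the result from the end.
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; --i) {
            result[i] = heap.poll();
        }
        return result;
    }

    public static void main(String[] args) {
        double[] scores = {0.1, 0.9, 0.5, 0.7};
        for (int category : topN(scores, 2)) {
            System.out.println(category + " " + scores[category]);
        }
    }
}
```

Because the heap never holds more than maxResults entries, each candidate costs at most O(log maxResults) regardless of how many categories are scored.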
This class provides no training methods. It is meant as a general utility for importing linear classifiers with large numbers of categories.
Constructor Summary  

BigVectorClassifier(Vector[] termVectors, int maxResults)
Construct a big vector classifier with the specified term vectors, maximum number of results, and categories equal to the string representations of the category identifiers.

BigVectorClassifier(Vector[] termVectors, String[] categories, int maxResults)
Construct a big vector classifier with the specified term vectors, categories, and maximum number of results.
Method Summary  

ScoredClassification classify(Vector x)
Return a scored classification consisting of the top results for the specified vector input.
int maxResults()
Return the maximum number of top results returned by this classifier.
void setMaxResults(int maxResults)
Set the maximum number of results returned by this classifier.
Methods inherited from class java.lang.Object 

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait 
Constructor Detail 

public BigVectorClassifier(Vector[] termVectors, int maxResults)
See BigVectorClassifier(Vector[],String[],int) for more information.
termVectors - Term vectors for classifier.
maxResults - Maximum number of top results returned.

public BigVectorClassifier(Vector[] termVectors, String[] categories, int maxResults)
termVectors - Term vectors for classifier.
categories - Category names indexed by number.
maxResults - Maximum number of top results returned.
Method Detail

public int maxResults()

public void setMaxResults(int maxResults)
This method is a write operation and should be read-write synchronized with calls to classify(Vector).
maxResults - Maximum number of top results returned by this classifier.

public ScoredClassification classify(Vector x)
The maximum size of the returned scored classification is given by maxResults() and set with setMaxResults(int).
classify in interface BaseClassifier<Vector>
classify in interface Classifier<Vector,ScoredClassification>
classify in interface RankedClassifier<Vector>
classify in interface ScoredClassifier<Vector>
x - Vector to classify.
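
The read-write synchronization recommended for setMaxResults(int) can be sketched with a standard ReentrantReadWriteLock. To stay self-contained, the sketch does not use LingPipe classes: LimitedClassifier is a hypothetical stand-in whose classify method only reports how many results it would return, and ReadWriteGuardDemo is an assumed wrapper name.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only; LimitedClassifier is a hypothetical stand-in.
final class ReadWriteGuardDemo {

    // Stand-in for a classifier with a mutable result limit.
    static final class LimitedClassifier {
        private int maxResults;

        LimitedClassifier(int maxResults) {
            this.maxResults = maxResults;
        }

        void setMaxResults(int maxResults) {
            this.maxResults = maxResults;
        }

        // Returns the number of results that would be kept.
        int classify(int numCandidates) {
            return Math.min(numCandidates, maxResults);
        }
    }

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final LimitedClassifier classifier;

    ReadWriteGuardDemo(LimitedClassifier classifier) {
        this.classifier = classifier;
    }

    // Reads may run concurrently with each other.
    int classify(int numCandidates) {
        lock.readLock().lock();
        try {
            return classifier.classify(numCandidates);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write excludes all concurrent reads and other writes.
    void setMaxResults(int maxResults) {
        lock.writeLock().lock();
        try {
            classifier.setMaxResults(maxResults);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteGuardDemo guard = new ReadWriteGuardDemo(new LimitedClassifier(3));
        System.out.println(guard.classify(10));
        guard.setMaxResults(5);
        System.out.println(guard.classify(10));
    }
}
```

With this arrangement, any number of threads can classify at once, while a change to the result limit waits for in-flight classifications to finish.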

