com.aliasi.stats
Class MultivariateEstimator

java.lang.Object
  extended by com.aliasi.stats.AbstractDiscreteDistribution
      extended by com.aliasi.stats.MultivariateDistribution
          extended by com.aliasi.stats.MultivariateEstimator
All Implemented Interfaces:
DiscreteDistribution, Serializable

public class MultivariateEstimator
extends MultivariateDistribution
implements Serializable

A MultivariateEstimator provides a maximum likelihood estimator of a multivariate distribution based on training samples. Training is carried out by incrementing outcome counts through train(String,long). At any point, the estimator returns the maximum likelihood estimate for the counts seen so far, that is, each outcome's count divided by the total training count.
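
The count-based estimate described above can be illustrated with a small stand-alone sketch. The class MleSketch and its behavior for an untrained estimator are assumptions made for illustration; this is not the LingPipe implementation, only the same arithmetic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone class; NOT part of LingPipe.
class MleSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private long total = 0L;

    // Adds the increment to the label's count and to the running total.
    void train(String label, long increment) {
        if (increment < 1) {
            throw new IllegalArgumentException("increment must be at least 1");
        }
        counts.merge(label, increment, Long::sum);
        total += increment;
    }

    // Maximum likelihood estimate: count(label) / total count.
    double probability(String label) {
        if (total == 0L) {
            return 0.0; // assumption of this sketch: nothing trained yet
        }
        return counts.getOrDefault(label, 0L) / (double) total;
    }

    public static void main(String[] args) {
        MleSketch est = new MleSketch();
        est.train("heads", 3);
        est.train("tails", 1);
        System.out.println(est.probability("heads")); // 3 / 4
        System.out.println(est.probability("tails")); // 1 / 4
    }
}
```

Because the estimate is recomputed from the counts, it can be queried between any two calls to train.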

Simple additive (add-one) smoothing can be achieved through the API by incrementing the count of every possible outcome by one before training on the observed samples.
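
As a rough illustration of what this initial increment does to the estimates, the sketch below computes (count + 1) / (total + K) for K outcomes. The class and method names are hypothetical and the code does not call LingPipe. With the documented API, the same effect would come from one call to train(label, 1) per known label before the real training data.

```java
// Hypothetical stand-alone class; NOT part of LingPipe.
class AddOneSmoothingSketch {

    // Returns the estimates obtained after every outcome's observed count
    // has been raised by one: (counts[i] + 1) / (sum(counts) + K).
    static double[] smoothed(long[] counts) {
        long total = 0L;
        for (long c : counts) {
            total += c + 1;
        }
        double[] probs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            probs[i] = (counts[i] + 1) / (double) total;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Observed counts 3, 0, 1 give smoothed estimates 4/7, 1/7, 2/7.
        for (double p : smoothed(new long[] {3, 0, 1})) {
            System.out.println(p);
        }
    }
}
```

The outcome that was never observed ends up with a small nonzero probability instead of zero, which is the point of the smoothing.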

Compilation and Serialization

Serialization simply stores the current multivariate estimator and reconstructs it exactly as is under deserialization (that is, the class of the deserialized object is MultivariateEstimator). Compilation stores a more efficient and compact version of the estimator, which deserializes to a MultivariateDistribution rather than a MultivariateEstimator.
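
The difference between the two round trips can be sketched with plain java.io. The classes EstimatorSketch and ConstantSketch are hypothetical, and the way the snapshot is written is an assumption chosen for brevity; LingPipe's own compilation mechanism may differ. The sketch only shows the general idea: serializing returns the trainable class, while compiling writes a frozen object of a different class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical frozen form: probabilities only, no counts.
class ConstantSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    final String[] labels;
    final double[] probs;

    ConstantSketch(String[] labels, double[] probs) {
        this.labels = labels;
        this.probs = probs;
    }
}

// Hypothetical trainable form: keeps the raw counts.
class EstimatorSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    final String[] labels = {"a", "b"};
    final long[] counts = {3, 1};

    // Writes a frozen snapshot, so reading it back yields a ConstantSketch.
    void compileTo(ObjectOutput out) throws IOException {
        long total = 0L;
        for (long c : counts) {
            total += c;
        }
        double[] probs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            probs[i] = counts[i] / (double) total;
        }
        out.writeObject(new ConstantSketch(labels.clone(), probs));
    }

    // Writes the estimator (serialized or compiled) to memory and reads it back.
    static Object roundTrip(EstimatorSketch est, boolean compile) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                if (compile) {
                    est.compileTo(out);
                } else {
                    out.writeObject(est);
                }
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        EstimatorSketch est = new EstimatorSketch();
        System.out.println(roundTrip(est, false).getClass().getSimpleName());
        System.out.println(roundTrip(est, true).getClass().getSimpleName());
    }
}
```

The compiled form cannot be trained further, which is the trade-off for the smaller representation.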

Since:
LingPipe 2.0
Version:
3.8
Author:
Bob Carpenter
See Also:
Serialized Form

Constructor Summary
MultivariateEstimator()
          Constructs a multivariate estimator with no known outcomes or counts.
 
Method Summary
 void compileTo(ObjectOutput objOut)
          Writes a constant version of this estimator to the specified object output.
 long getCount(long outcome)
          Returns the count in this estimator for the specified outcome.
 long getCount(String outcomeLabel)
          Returns the count for the specified outcome.
 String label(long outcome)
          Returns the label for the specified outcome.
 int numDimensions()
          Returns the number of dimensions for this multivariate estimator.
 long outcome(String outcomeLabel)
          Returns the outcome for the specified label.
 double probability(long outcome)
          Returns the multivariate probability estimate for the specified outcome.
 void resetCount(String outcomeLabel)
          Resets the count for the specified outcome label to zero.
 void train(String outcomeLabel, long increment)
          Increments counts in this estimator for the specified outcome by the specified increment.
 long trainingSampleCount()
          Returns the total count of training samples.
 
Methods inherited from class com.aliasi.stats.MultivariateDistribution
log2Probability, maxOutcome, minOutcome, probability
 
Methods inherited from class com.aliasi.stats.AbstractDiscreteDistribution
cumulativeProbability, cumulativeProbabilityGreater, cumulativeProbabilityLess, entropy, log2Probability, mean, variance
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MultivariateEstimator

public MultivariateEstimator()
Constructs a multivariate estimator with no known outcomes or counts.

Method Detail

resetCount

public void resetCount(String outcomeLabel)
Resets the count for the specified outcome label to zero. Calling this method also decrements the total count for this estimator by the outcome's previous count.

Parameters:
outcomeLabel - Label of outcome that is reset.
Throws:
IllegalArgumentException - If the outcome label is not known.

train

public void train(String outcomeLabel,
                  long increment)
Increments counts in this estimator for the specified outcome by the specified increment.

Parameters:
outcomeLabel - Label of sample outcome.
increment - Amount to increment count for outcome.
Throws:
IllegalArgumentException - If the result would be a count higher than the maximum long value or if the increment is less than one.
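
The two failure conditions listed above can be expressed as a guard that runs before any count is changed. The following is a minimal sketch with a hypothetical helper, checkedIncrement, and is not taken from the LingPipe source.

```java
// Hypothetical stand-alone class; NOT part of LingPipe.
class TrainCheckSketch {

    // Returns current + increment, or throws if the increment is less than
    // one or the sum would exceed Long.MAX_VALUE.
    static long checkedIncrement(long current, long increment) {
        if (increment < 1) {
            throw new IllegalArgumentException(
                "increment must be at least 1, found " + increment);
        }
        // Rearranged comparison so the check itself cannot overflow.
        if (current > Long.MAX_VALUE - increment) {
            throw new IllegalArgumentException(
                "count would exceed the maximum long value");
        }
        return current + increment;
    }

    public static void main(String[] args) {
        System.out.println(checkedIncrement(5, 2)); // 7
    }
}
```

Checking before adding keeps the stored count valid even when the call is rejected.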

outcome

public long outcome(String outcomeLabel)
Returns the outcome for the specified label.

Overrides:
outcome in class MultivariateDistribution
Parameters:
outcomeLabel - Label whose outcome is returned.
Returns:
The outcome for the specified label.

label

public String label(long outcome)
Returns the label for the specified outcome.

Overrides:
label in class MultivariateDistribution
Parameters:
outcome - Outcome whose label is returned.
Returns:
The label for the specified outcome.

numDimensions

public int numDimensions()
Returns the number of dimensions for this multivariate estimator.

Specified by:
numDimensions in class MultivariateDistribution
Returns:
The number of dimensions for this multivariate estimator.

probability

public double probability(long outcome)
Returns the multivariate probability estimate for the specified outcome.

Specified by:
probability in interface DiscreteDistribution
Specified by:
probability in class MultivariateDistribution
Parameters:
outcome - The outcome whose probability is returned.
Returns:
The probability of the specified outcome.

getCount

public long getCount(long outcome)
Returns the count in this estimator for the specified outcome.

Parameters:
outcome - The outcome whose count is returned.
Returns:
The count of the specified outcome.
Throws:
IllegalArgumentException - If the outcome is not between zero and the maximum outcome inclusive.

getCount

public long getCount(String outcomeLabel)
Returns the count for the specified outcome.

Parameters:
outcomeLabel - Label of specified outcome.
Returns:
Count of outcome in this estimator.
Throws:
IllegalArgumentException - If the

trainingSampleCount

public long trainingSampleCount()
Returns the total count of training samples.

Returns:
The total count for this estimator.

compileTo

public void compileTo(ObjectOutput objOut)
               throws IOException
Writes a constant version of this estimator to the specified object output. The distribution read back in will be an instance of MultivariateConstant with the same distribution as the estimated distribution.

Parameters:
objOut - The object output to which this estimator is compiled.
Throws:
IOException - If there is an I/O exception writing to the output.