com.aliasi.stats
Class AbstractDiscreteDistribution

java.lang.Object
  extended by com.aliasi.stats.AbstractDiscreteDistribution
All Implemented Interfaces:
DiscreteDistribution
Direct Known Subclasses:
BinomialDistribution, MultivariateDistribution, PoissonDistribution, ZipfDistribution

public abstract class AbstractDiscreteDistribution
extends Object
implements DiscreteDistribution

AbstractDiscreteDistribution is an abstract base class that supplies default implementations of the DiscreteDistribution methods. Concrete subclasses need only implement the probability(long) method, which returns the probability of a given outcome.

The methods minOutcome() and maxOutcome() bound the range of outcomes with non-zero probability. They default to Long.MIN_VALUE and Long.MAX_VALUE, respectively. Concrete subclasses should override these methods with the tightest possible bounds, because cumulative probabilities, means, variances and entropies are computed by looping from the minimum to the maximum outcome and evaluating the probability at each point.
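
As a rough illustration of the subclassing contract described above, the sketch below defines a fair six-sided die. To stay self-contained it extends a hypothetical stand-in class, SketchDistribution, rather than the LingPipe class itself; only the method names probability(long), minOutcome() and maxOutcome() are taken from this page, and the class names are invented for the example.

```java
// Hypothetical stand-in for AbstractDiscreteDistribution so that the sketch
// compiles without LingPipe; only the three method names mirror this page.
abstract class SketchDistribution {
    abstract double probability(long outcome);
    long minOutcome() { return Long.MIN_VALUE; }  // loose default bound
    long maxOutcome() { return Long.MAX_VALUE; }  // loose default bound
}

// A fair six-sided die: outcomes 1..6, each with probability 1/6.
class FairDieSketch extends SketchDistribution {
    @Override
    double probability(long outcome) {
        return (outcome >= 1 && outcome <= 6) ? 1.0 / 6.0 : 0.0;
    }

    // Tight bounds keep any loop over [minOutcome(), maxOutcome()] short.
    @Override
    long minOutcome() { return 1L; }

    @Override
    long maxOutcome() { return 6L; }
}
```

With the bounds overridden, a loop from minOutcome() to maxOutcome() visits six outcomes instead of the full range of long.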

Since:
LingPipe2.0
Version:
2.0
Author:
Bob Carpenter

Constructor Summary
AbstractDiscreteDistribution()
          Construct an abstract discrete distribution.
 
Method Summary
 double cumulativeProbability(long lowerBound, long upperBound)
          Returns the cumulative probability of all outcomes between the specified bounds, inclusive.
 double cumulativeProbabilityGreater(long lowerBound)
          Returns the cumulative probability of all outcomes greater than or equal to the specified lower bound.
 double cumulativeProbabilityLess(long upperBound)
          Returns the cumulative probability of all outcomes less than or equal to the specified upper bound.
 double entropy()
          Returns the entropy of this distribution in bits (log 2).
 double log2Probability(long outcome)
          Returns the log (base 2) probability of the specified outcome.
 long maxOutcome()
          Returns the maximum outcome with non-zero probability for this distribution.
 double mean()
          Returns the mean of this distribution.
 long minOutcome()
          Returns the minimum outcome with non-zero probability for this distribution.
abstract  double probability(long outcome)
          Returns the probability of the specified outcome in this distribution.
 double variance()
          Returns the variance of this distribution.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractDiscreteDistribution

public AbstractDiscreteDistribution()
Construct an abstract discrete distribution.

Method Detail

probability

public abstract double probability(long outcome)
Returns the probability of the specified outcome in this distribution. This abstract method is the only one that needs to be implemented by subclasses, though most will also override the minimum and maximum outcome methods.

Specified by:
probability in interface DiscreteDistribution
Parameters:
outcome - Outcome whose probability is returned.
Returns:
Probability of specified outcome.

cumulativeProbabilityLess

public double cumulativeProbabilityLess(long upperBound)
Returns the cumulative probability of all outcomes less than or equal to the specified upper bound. This method is implemented by looping over all outcomes that are at most the specified upper bound and that lie within the minimum and maximum outcome bounds.

Specified by:
cumulativeProbabilityLess in interface DiscreteDistribution
Parameters:
upperBound - Upper bound of outcome.
Returns:
The cumulative probability of all outcomes less than or equal to the specified upper bound.

cumulativeProbabilityGreater

public double cumulativeProbabilityGreater(long lowerBound)
Returns the cumulative probability of all outcomes greater than or equal to the specified lower bound. This method is implemented by looping over all outcomes that are at least the specified lower bound and that lie within the minimum and maximum outcome bounds.

Specified by:
cumulativeProbabilityGreater in interface DiscreteDistribution
Parameters:
lowerBound - Lower bound on outcomes.
Returns:
The cumulative probability of all outcomes greater than or equal to the specified lower bound.

cumulativeProbability

public double cumulativeProbability(long lowerBound,
                                    long upperBound)
Returns the cumulative probability of all outcomes between the specified bounds, inclusive. This method is implemented by looping over all outcomes within range of the specified bounds and within the minimum and maximum outcomes for this distribution.

Specified by:
cumulativeProbability in interface DiscreteDistribution
Parameters:
lowerBound - Lower bound of outcome set.
upperBound - Upper bound of outcome set.
Returns:
The cumulative probability of all outcomes between the bounds, inclusive.
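
The looping strategy described for the cumulative methods might look roughly like the following. This is a minimal sketch, not the LingPipe source: the class CumulativeSketch, its fair-die probability function and its fixed bounds are all assumptions made for illustration.

```java
// Hypothetical helper class; not part of LingPipe.
class CumulativeSketch {
    // Assumed example distribution: a fair die on outcomes 1..6.
    static final long MIN_OUTCOME = 1L;
    static final long MAX_OUTCOME = 6L;

    static double probability(long outcome) {
        return (outcome >= MIN_OUTCOME && outcome <= MAX_OUTCOME)
            ? 1.0 / 6.0 : 0.0;
    }

    // Sums probability(k) for k in [lowerBound, upperBound] after clamping
    // the requested range to [MIN_OUTCOME, MAX_OUTCOME].
    static double cumulativeProbability(long lowerBound, long upperBound) {
        long lo = Math.max(lowerBound, MIN_OUTCOME);
        long hi = Math.min(upperBound, MAX_OUTCOME);
        double sum = 0.0;
        // Safe here because hi <= 6; a bound of Long.MAX_VALUE would make
        // the increment overflow and the loop never terminate.
        for (long k = lo; k <= hi; ++k) {
            sum += probability(k);
        }
        return sum;
    }
}
```

Under this sketch, the one-sided variants could be expressed by passing Long.MIN_VALUE or Long.MAX_VALUE as the missing bound, since clamping replaces it with the distribution's own bound.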

log2Probability

public double log2Probability(long outcome)
Returns the log (base 2) probability of the specified outcome. Implemented by taking the base 2 log of the value returned by probability(long) for that outcome.

Specified by:
log2Probability in interface DiscreteDistribution
Parameters:
outcome - Outcome whose log probability is returned.
Returns:
Log (base 2) probability of the specified outcome.
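
One common way to take a base 2 log in Java is to divide the natural log by the natural log of 2. The sketch below uses a hypothetical helper class, Log2Sketch; whether LingPipe computes the value this way is an assumption.

```java
// Hypothetical helper class; not part of LingPipe.
class Log2Sketch {
    // Base 2 log via the change-of-base identity log2(x) = ln(x) / ln(2).
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }
}
```

In this sketch a probability of zero maps to negative infinity, because Math.log(0.0) returns Double.NEGATIVE_INFINITY.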

minOutcome

public long minOutcome()
Returns the minimum outcome with non-zero probability for this distribution. Implemented to return the constant Long.MIN_VALUE. If possible, concrete subclasses should override this method with a tighter bound.

Specified by:
minOutcome in interface DiscreteDistribution
Returns:
The minimum outcome for this distribution.

maxOutcome

public long maxOutcome()
Returns the maximum outcome with non-zero probability for this distribution. Implemented to return the constant Long.MAX_VALUE. If possible, concrete subclasses should override this method with a tighter bound.

Specified by:
maxOutcome in interface DiscreteDistribution
Returns:
The maximum outcome for this distribution.

mean

public double mean()
Returns the mean of this distribution. This is implemented as a sum of the outcomes between the minimum and maximum for this distribution, with each outcome weighted by its probability.

Specified by:
mean in interface DiscreteDistribution
Returns:
Mean of this distribution.

variance

public double variance()
Returns the variance of this distribution. This is implemented by first computing the mean and then looping over outcomes between the minimum and maximum and summing the squared differences between outcomes and the mean, weighted by outcome probability.

Specified by:
variance in interface DiscreteDistribution
Returns:
Variance of this distribution.
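
The mean and variance loops described on this page can be sketched as follows. The class MomentsSketch and its fair-die probability function are assumptions for illustration, and the bounds are passed in explicitly rather than read from minOutcome() and maxOutcome().

```java
// Hypothetical helper class; not part of LingPipe.
class MomentsSketch {
    // Assumed example distribution: a fair die on outcomes 1..6.
    static double probability(long outcome) {
        return (outcome >= 1 && outcome <= 6) ? 1.0 / 6.0 : 0.0;
    }

    // Mean: sum over outcomes of outcome * probability(outcome).
    static double mean(long min, long max) {
        double sum = 0.0;
        for (long k = min; k <= max; ++k) {
            sum += k * probability(k);
        }
        return sum;
    }

    // Variance: first compute the mean, then sum the squared differences
    // from the mean, each weighted by the outcome's probability.
    static double variance(long min, long max) {
        double mean = mean(min, max);
        double sum = 0.0;
        for (long k = min; k <= max; ++k) {
            double diff = k - mean;
            sum += probability(k) * diff * diff;
        }
        return sum;
    }
}
```

For the fair die, mean(1, 6) is 3.5 and variance(1, 6) is 35/12, up to floating-point rounding.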

entropy

public double entropy()
Returns the entropy of this distribution in bits (log 2). Recall that entropy in bits (base 2) is defined by:
H(P) = - Σ_x P(x) * log2 P(x)
This method is implemented by iterating over the outcomes between the minimum and maximum and summing the negated, probability-weighted log (base 2) probabilities of those outcomes.

Specified by:
entropy in interface DiscreteDistribution
Returns:
The entropy of this distribution.
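
A minimal sketch of the entropy sum follows, using a hypothetical class EntropySketch and an assumed uniform distribution over the outcomes 0..3. Skipping zero-probability outcomes (treating 0 * log2 0 as 0) is a convention chosen for this example; this page does not say how LingPipe handles that case.

```java
// Hypothetical helper class; not part of LingPipe.
class EntropySketch {
    // Assumed example distribution: uniform over outcomes 0..3.
    static double probability(long outcome) {
        return (outcome >= 0 && outcome <= 3) ? 0.25 : 0.0;
    }

    // Entropy in bits: H(P) = - sum over x of P(x) * log2 P(x).
    static double entropy(long min, long max) {
        double sum = 0.0;
        for (long k = min; k <= max; ++k) {
            double p = probability(k);
            if (p > 0.0) {  // assumed convention: 0 * log2(0) counts as 0
                sum -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return sum;
    }
}
```

Four equally likely outcomes carry 2 bits of entropy, so entropy(0, 3) comes out at 2.0 up to rounding; widening the bounds only adds zero-probability terms that are skipped.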