com.aliasi.stats

## Class BinomialDistribution

• All Implemented Interfaces:
DiscreteDistribution

```public class BinomialDistribution
extends AbstractDiscreteDistribution```
A `BinomialDistribution` is a discrete distribution over the number of successes in a fixed number of Bernoulli trials. A binomial distribution is constructed from a specified Bernoulli distribution, which determines the success probability, and a number of trials. The minimum outcome is `0` and the maximum outcome is the number of trials. This class also defines a static method `log2BinomialCoefficient(long,long)` for computing the log (base 2) of binomial coefficients.

The method `z(int)` returns the z-score statistic for a specified number of successes.

### Computing P-Values

As of LingPipe 3.2.0, the dependency on Jakarta Commons Math was removed. As a result, the two methods that computed p-values, along with their static counterparts, were removed as well. Their implementation is reproduced here in case you need the functionality:

```
import org.apache.commons.math.MathException;
import org.apache.commons.math.distribution.NormalDistribution;
import org.apache.commons.math.distribution.NormalDistributionImpl;

// Standard normal distribution (mean 0, standard deviation 1).
static final NormalDistribution Z_DISTRIBUTION
    = new NormalDistributionImpl();

/**
 * Returns the two-sided p-value computed from the z-score for
 * this distribution for the specified number of successes.
 ...
 */
double pValue(int numSuccesses) throws MathException {
    return pValue(bernoulliDistribution().successProbability(),
                  numSuccesses,
                  (int) numTrials()); // numTrials() returns a long
}

/**
 * Returns the one-sided p-value computed from the z-score for
 * this distribution for the specified number of successes.
 ...
 */
double pValueLess(int numSuccesses) throws MathException {
    return pValueLess(bernoulliDistribution().successProbability(),
                      numSuccesses,
                      (int) numTrials()); // numTrials() returns a long
}

/**
 * Returns the two-sided p-value for the z-score statistic on the
 * specified number of successes out of the specified number of
 * trials for the specified success probability.
 ...
 */
static double pValue(double successProbability,
                     int numSuccesses,
                     int numTrials) throws MathException {

    double z = z(successProbability,numSuccesses,numTrials);
    // Math.min(-z,z) is -|z|; doubling the lower tail gives both tails.
    return 2.0 * Z_DISTRIBUTION.cumulativeProbability(Math.min(-z,z));
}

/**
 * Returns the one-sided (lower) p-value for the z-score statistic
 * on the specified number of successes out of the specified
 * number of trials for the specified success probability.
 ...
 */
static double pValueLess(double successProbability,
                         int numSuccesses,
                         int numTrials) throws MathException {
    double z = z(successProbability,numSuccesses,numTrials);
    // 1 - Phi(z): probability that a standard normal value exceeds z.
    return 1.0 - Z_DISTRIBUTION.cumulativeProbability(z);
}
```
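If Commons Math is not available either, the two-sided p-value can be approximated without any dependency. The following is only a sketch: the class `NormalPValueSketch` and its method names are hypothetical and not part of LingPipe, and the normal cumulative distribution is computed with the Abramowitz and Stegun polynomial approximation to the error function (formula 7.1.26, absolute error on the order of 1.5e-7), which is a substitute for the Commons Math implementation used above.

```java
/** Hypothetical helper, not part of LingPipe or Commons Math. */
final class NormalPValueSketch {

    private NormalPValueSketch() { }

    /** Approximates erf(x) with Abramowitz and Stegun formula 7.1.26. */
    static double erf(double x) {
        double sign = x < 0.0 ? -1.0 : 1.0;
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * ax);
        // Horner evaluation of the degree-5 polynomial in t.
        double poly = t * (0.254829592
                    + t * (-0.284496736
                    + t * (1.421413741
                    + t * (-1.453152027
                    + t * 1.061405429))));
        return sign * (1.0 - poly * Math.exp(-ax * ax));
    }

    /** Approximate cumulative probability of the standard normal at z. */
    static double standardNormalCdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }

    /** Two-sided p-value for a z-score: twice the tail below -|z|. */
    static double twoSidedPValue(double z) {
        return 2.0 * standardNormalCdf(-Math.abs(z));
    }

    public static void main(String[] args) {
        // z = 2.0 corresponds to 60 successes in 100 trials at p = 0.5.
        System.out.println(twoSidedPValue(2.0));
    }
}
```

The z-score passed in would come from `z(int)` or the static `z(double,int,int)`. The approximation is adequate for rough significance checks, but not for p-values far out in the tails.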

• Eric W. Weisstein. Binomial Distribution. From MathWorld--A Wolfram Web Resource.
• Eric W. Weisstein. Binomial Coefficient. From MathWorld--A Wolfram Web Resource.
• Eric W. Weisstein. z-Score. From MathWorld--A Wolfram Web Resource.
• Eric W. Weisstein. P-Value. From MathWorld--A Wolfram Web Resource.
• Eric W. Weisstein. Hypothesis Testing. From MathWorld--A Wolfram Web Resource.
Since:
LingPipe 2.0
Version:
3.2.0
Author:
Bob Carpenter
• ### Constructor Summary

Constructors
Constructor and Description
```BinomialDistribution(BernoulliDistribution distribution, int numTrials)```
Construct a binomial distribution that samples from the specified Bernoulli distribution the specified number of times.
• ### Method Summary

All Methods
Modifier and Type Method and Description
`BernoulliDistribution` `bernoulliDistribution()`
Returns the Bernoulli (two outcome) distribution underlying this binomial distribution.
`static double` ```log2BinomialCoefficient(long n, long m)```
Returns the log (base 2) of the binomial coefficient of the specified arguments.
`double` `log2Probability(long outcome)`
Returns the log (base 2) probability of the specified outcome.
`long` `maxOutcome()`
Returns the maximum non-zero probability outcome, which is the number of trials for this distribution.
`long` `minOutcome()`
Returns zero, the minimum outcome for a binomial distribution.
`long` `numTrials()`
Returns the number of trials for this binomial distribution.
`double` `probability(long outcome)`
Returns the probability of the specified outcome.
`double` `variance()`
Returns the variance of this binomial distribution.
`static double` ```z(double successProbability, int numSuccesses, int numTrials)```
Returns the z score for the specified number of successes out of the specified number of trials given the specified success probability.
`double` `z(int numSuccesses)`
Returns the z-score for the specified number of successes given this distribution's success probability and number of trials.
• ### Methods inherited from class com.aliasi.stats.AbstractDiscreteDistribution

`cumulativeProbability, cumulativeProbabilityGreater, cumulativeProbabilityLess, entropy, mean`
• ### Methods inherited from class java.lang.Object

`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
• ### Constructor Detail

• #### BinomialDistribution

```public BinomialDistribution(BernoulliDistribution distribution,
int numTrials)```
Construct a binomial distribution that samples from the specified Bernoulli distribution the specified number of times. The resulting distribution is over the number of successes, with a range between zero and the number of trials.

The Bernoulli distribution is stored and any change to it will affect the constructed binomial distribution.

Parameters:
`distribution` - Underlying Bernoulli distribution.
`numTrials` - Number of trials.
• ### Method Detail

• #### bernoulliDistribution

`public BernoulliDistribution bernoulliDistribution()`
Returns the Bernoulli (two outcome) distribution underlying this binomial distribution.
Returns:
The base distribution.
• #### minOutcome

`public long minOutcome()`
Returns zero, the minimum outcome for a binomial distribution.
Specified by:
`minOutcome` in interface `DiscreteDistribution`
Overrides:
`minOutcome` in class `AbstractDiscreteDistribution`
Returns:
Zero, the minimum outcome for a binomial distribution.
• #### maxOutcome

`public long maxOutcome()`
Returns the maximum non-zero probability outcome, which is the number of trials for this distribution.
Specified by:
`maxOutcome` in interface `DiscreteDistribution`
Overrides:
`maxOutcome` in class `AbstractDiscreteDistribution`
Returns:
The maximum non-zero probability outcome.
• #### numTrials

`public long numTrials()`
Returns the number of trials for this binomial distribution. This is the same as the result of `maxOutcome()`.
Returns:
The number of trials.
• #### probability

`public double probability(long outcome)`
Returns the probability of the specified outcome. The probability is determined by the likelihood of the specified number of successes out of the number of trials for this distribution.

The probability for a specified number of successes is:

```
P(numSuccesses) = binomialCoefficient(numTrials, numSuccesses)
                  * P(success)^numSuccesses
                  * (1 - P(success))^(numTrials - numSuccesses)
```
where `numSuccesses` is the specified outcome, `numTrials` is the number of trials for this binomial distribution and `P(success)` is the success probability of the Bernoulli distribution underlying this binomial distribution.
Specified by:
`probability` in interface `DiscreteDistribution`
Specified by:
`probability` in class `AbstractDiscreteDistribution`
Parameters:
`outcome` - Number of successes.
Returns:
Probability of specified number of successes.
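As an illustration of the probability formula for `probability(long)`, here is a minimal self-contained sketch. The class `BinomialPmfSketch` and its methods are hypothetical and are not LingPipe's implementation. The coefficient is accumulated as a running product in double precision, which is fine for moderate numbers of trials but can overflow or underflow for very large ones, where working in log space is the safer choice.

```java
/** Hypothetical helper, not part of LingPipe. */
final class BinomialPmfSketch {

    private BinomialPmfSketch() { }

    /** Binomial coefficient "n choose m" as a running product. */
    static double binomialCoefficient(long n, long m) {
        long k = Math.min(m, n - m); // symmetry keeps the loop short
        double coef = 1.0;
        for (long i = 1; i <= k; ++i)
            coef = coef * (n - k + i) / i;
        return coef;
    }

    /** Probability of numSuccesses successes in numTrials trials. */
    static double probability(double successProb,
                              long numTrials,
                              long numSuccesses) {
        if (numSuccesses < 0 || numSuccesses > numTrials)
            return 0.0; // outside the range of outcomes
        return binomialCoefficient(numTrials, numSuccesses)
            * Math.pow(successProb, numSuccesses)
            * Math.pow(1.0 - successProb, numTrials - numSuccesses);
    }

    public static void main(String[] args) {
        // 5 successes in 10 fair trials: 252 / 1024
        System.out.println(probability(0.5, 10, 5));
    }
}
```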
• #### log2Probability

`public double log2Probability(long outcome)`
Returns the log (base 2) probability of the specified outcome. The probability is determined by the likelihood of the specified number of successes out of the number of trials for this distribution. See the documentation for the method `probability(long)` for an exact definition.
Specified by:
`log2Probability` in interface `DiscreteDistribution`
Overrides:
`log2Probability` in class `AbstractDiscreteDistribution`
Parameters:
`outcome` - Number of successes.
Returns:
Log (base 2) probability of specified number of successes.
• #### z

`public double z(int numSuccesses)`
Returns the z-score for the specified number of successes given this distribution's success probability and number of trials. Z-scores may take on any value from negative to positive infinity. A z-score is the number of standard deviations above or below the expected number of successes for this distribution. Thus the greater the absolute value of the z-score, the less likely the number of successes was drawn from this distribution. The lower a negative z-score, the more likely it was drawn from a distribution with a lower success probability and the higher a positive z-score, the more likely it was drawn from a distribution with a higher success probability.

The formula for z-scores is provided in the documentation for the static method `z(double,int,int)`.

Parameters:
`numSuccesses` - Number of successes in sample.
Returns:
Z score value.
Throws:
`IllegalArgumentException` - If the number of successes is less than 0 or more than the number of trials for this distribution.
• #### variance

`public double variance()`
Returns the variance of this binomial distribution. The variance of a binomial distribution is:
variance = numTrials * P(success) * (1 - P(success))
Specified by:
`variance` in interface `DiscreteDistribution`
Overrides:
`variance` in class `AbstractDiscreteDistribution`
Returns:
The variance of this binomial distribution.
• #### z

```public static double z(double successProbability,
int numSuccesses,
int numTrials)```
Returns the z-score for the specified number of successes out of the specified number of trials given the specified success probability. The z-score is the number of standard deviations by which the given number of successes lies above or below the expected number of successes, given the success probability and number of trials.

The z-score for binomial distributions is defined by:

```
z = (numSuccesses - expectedSuccesses)
    / (numTrials * P(success) * (1 - P(success)))^(1/2)
```
where
```
expectedSuccesses = P(success) * numTrials
```
Thus the numerator is the difference between the observed and expected values for the number of successes, and the denominator is the standard deviation for the Bernoulli trial iterated over the specified number of trials.
Parameters:
`successProbability` - Probability of success.
`numSuccesses` - Number of successes.
`numTrials` - Number of trials.
Throws:
`IllegalArgumentException` - If the success probability is not between 0 and 1 or if the number of successes is less than zero or greater than the number of trials.
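The z-score formula for the static `z(double,int,int)` method can be sketched in a few lines of standalone Java. The class `BinomialZSketch` is hypothetical and only mirrors the formula and the documented argument checks; it is not the LingPipe source.

```java
/** Hypothetical helper, not part of LingPipe. */
final class BinomialZSketch {

    private BinomialZSketch() { }

    /** Variance of the binomial: numTrials * p * (1 - p). */
    static double variance(double successProb, int numTrials) {
        return numTrials * successProb * (1.0 - successProb);
    }

    /** Standard deviations between observed and expected successes. */
    static double z(double successProb, int numSuccesses, int numTrials) {
        if (!(successProb >= 0.0 && successProb <= 1.0))
            throw new IllegalArgumentException(
                "Success probability must be between 0 and 1.");
        if (numSuccesses < 0 || numSuccesses > numTrials)
            throw new IllegalArgumentException(
                "Successes must be between 0 and the number of trials.");
        double expectedSuccesses = successProb * numTrials;
        return (numSuccesses - expectedSuccesses)
            / Math.sqrt(variance(successProb, numTrials));
    }

    public static void main(String[] args) {
        // Expected successes 50, standard deviation 5, so z is 2.
        System.out.println(z(0.5, 60, 100));
    }
}
```

Note that the denominator is zero when the success probability is exactly 0 or 1, so the result in those cases is infinite or NaN in this sketch.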
• #### log2BinomialCoefficient

```public static double log2BinomialCoefficient(long n,
long m)```
Returns the log (base 2) of the binomial coefficient of the specified arguments. The binomial coefficient is equal to the number of ways to choose a subset of size `m` from a set of `n` objects, which is pronounced "n choose m", and is given by:
```
binomialCoefficient(n,m) = n! / (m! * (n-m)!)
log2 binomialCoefficient(n,m) = log2 n! - log2 m! - log2 (n-m)!
```
Returns:
The log (base 2) of the binomial coefficient of the specified arguments.
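One way to evaluate the log (base 2) binomial coefficient without ever forming the factorials is to sum logarithms term by term. The sketch below assumes that approach; the class `Log2ChooseSketch` is hypothetical, and LingPipe's own implementation may compute the value differently.

```java
/** Hypothetical helper, not part of LingPipe. */
final class Log2ChooseSketch {

    private Log2ChooseSketch() { }

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * Log (base 2) of "n choose m", using
     * choose(n,m) = product over i = 1..k of (n - k + i) / i,
     * where k = min(m, n - m).
     */
    static double log2BinomialCoefficient(long n, long m) {
        if (n < 0 || m < 0 || m > n)
            throw new IllegalArgumentException(
                "Require 0 <= m <= n.");
        long k = Math.min(m, n - m);
        double sum = 0.0;
        for (long i = 1; i <= k; ++i)
            sum += log2((double) (n - k + i)) - log2((double) i);
        return sum;
    }

    public static void main(String[] args) {
        // 10 choose 5 is 252, so this prints roughly log2(252).
        System.out.println(log2BinomialCoefficient(10, 5));
    }
}
```

The loop runs `min(m, n-m)` times, so this is linear in the smaller of the two arguments. For very large arguments a log-gamma based formula would be faster.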