com.aliasi.stats
Class PotentialScaleReduction

java.lang.Object
  extended by com.aliasi.stats.PotentialScaleReduction

public class PotentialScaleReduction
extends Object

The PotentialScaleReduction class provides an online computationa of Rhat, the potential scale reduction statistic for measuring mixing and convergence of multiple Markov chain Monte Carlo (MCMC) samplers.

At construction time, the number of estimators is specified. There must be at least two estimators in order to compute Rhat. Samples from the Markov chains are provided to this class via the update(int,double) method.

Normality Assumptiosn

These estimates make nomality assumptions for the samples which are not justified at samller sample sizes for all ditribution shapes. It may help to transform samples on a [0,1] scale using the inverse logistic (logit) transform, and samples representing ratios in [0,infinity) using a log transform.

Definition of Rhat

The idea is to compare the cross-chain variances to the within-chain variances, with values near 1.0 indicating good mixing, and values greater than 1 indicating the potential for better mixing. Suppose we have M Markov chains with N samples each, with a sample being a floating point value y[m,n]. As usual, y[m,] is the sequence of samples from the single chain m and y[,n] are the n-th samples from each chain. Unbiased mean and variance estimates are defined for sequences in the usual way (see OnlineNormalEstimator for definitions). Using vector notation, mean'(y[,]) is the average value of all samples whereas mean'(y[m,]) is the average of samples from chain m; similarly var'(y[,]) is the variance over all samples and var'(y[m,]) the variance of samples in chain m.

The definition of the Rhat is:

 Rhat = sqrt(varHatPlus/W)
where varHatPlus is a weighted average of the within-chain (W) and between-chain (B) variances.
 varHatPlus = (N-1)/N * W + 1/N * B.
The between-chain variance is defined by
 B = N * var'(mean'(y[m,]))

   = N/(M-1) * Σm (mean'(y[m,]) - mean'(y[,]))2.
The within-chain variance is the average of the unbiased within-chain variance estimates:
 W = mean'(var'(y[m,]))

   = 1/M Σm var'(y[m,]).
This is the usual definition for chains in which there are the same number of samples. For the implementation here, we take N to be the minimum of the numbers of samples in the chains. The within-chain statistics mean'(y[m,]) and var'(y[m,]) are computed using all of the samples for chain m. But the cross-chain statistics are not normalized, so mean'(y[,]) is computed here as mean'(mean'(y[m,])).

Per-Chain and Global Statistics

The estimators for the within-chain means and variances, mean'(y[m,]) and var'(y[m,]), are available through the estimator returned by estimator(int).

An estimator for the complete set of samples mean and variance, mean'(y[,]) and var'(y[,]), are available through globalEstimator(). Note that these are truly global estimates, not the estimates used in asynchronous Rhat calculations as defined in he previous section.

Thread Safety

Updates are write operations that need to be read-write synchronized with the estimator methods.

References

The method was introduced by Gelman and Rubin in 1992, summarized in their book, and implemented in the R coda package.

Since:
Lingpipe3.9.1
Version:
3.9.1
Author:
Bob Carpenter

Constructor Summary
PotentialScaleReduction(double[][] yss)
          Construct a potential scale reduction for the specified matrix Of estimates for each chain.
PotentialScaleReduction(int numChains)
          Construct a potential scale reduction with the specified number of Markov chains for input.
 
Method Summary
 OnlineNormalEstimator estimator(int chain)
          Returns the estimator for the specified chain.
 OnlineNormalEstimator globalEstimator()
          Returns the estimator that pools the estimates across the chains.
 int numChains()
          Returns the number of chains for this estimator.
 double rHat()
          Returns the Rhat statistic as defined in the class documentation.
 void update(int chain, double y)
          Provide a sample of the specified value from the specified chain.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PotentialScaleReduction

public PotentialScaleReduction(int numChains)
Construct a potential scale reduction with the specified number of Markov chains for input.

Parameters:
numChains - Number of Markov chains.
Throws:
IllegalArgumentException - If the number of chains is less than 2.

PotentialScaleReduction

public PotentialScaleReduction(double[][] yss)
Construct a potential scale reduction for the specified matrix Of estimates for each chain. The matrix entries yss[m][n] are for the n-th sample from chain m. The chains may have different numbers of samples.

Parameters:
yss - Matrix of estimates by chain and sample.
Throws:
IllegalStateException - If the number of chains (length of yss) is less than 2.
Method Detail

numChains

public int numChains()
Returns the number of chains for this estimator.

Returns:
Number of chains for this estimator.

estimator

public OnlineNormalEstimator estimator(int chain)
Returns the estimator for the specified chain.

Parameters:
chain - Index of chain.
Returns:
Estimator for the chain.

globalEstimator

public OnlineNormalEstimator globalEstimator()
Returns the estimator that pools the estimates across the chains.

Returns:
Overall estimator for samples.

update

public void update(int chain,
                   double y)
Provide a sample of the specified value from the specified chain.

Parameters:
chain - Chain from which sample was drawn.
y - Value of sample.
Throws:
IndexOutOfBoundsException - If the chain is less than zero or greater than or equal to the number of chains.

rHat

public double rHat()
Returns the Rhat statistic as defined in the class documentation.

Returns:
The Rhat statistic.