DEMC procedure

Performs Bayesian computing using the Differential Evolution Markov Chain algorithm (W. van den Berg & R.W. Payne).


Options

PRINT = string
What to print (results, monitoring, scatterplot, histogram); default resu, moni, scat, hist

CALCULATION = expression
Calculation(s) of logposterior, involving explanatory or pointer variate; if unset, this is calculated by the procedure specified by the PROCEDURE option

LOGPOSTERIOR = string
Identifier of log-posterior within CALCULATION (must be set if CALCULATION is set)

MULTIPLE = scalar
Number of populations is number of parameters times MULTIPLE; default 3

UNIFORMLIMIT = scalar
Uniform random numbers are drawn from (-UNIFORMLIMIT, UNIFORMLIMIT) and added to candidate parameter sets; default 0.00001

DATA = identifiers
Data structures used in CALCULATION or by PROCEDURE

NGENERATIONS = scalar
Maximum number of iterations; default 1000

STEP1 = scalar or variate
Generations for which gamma is set to 1; default 0

FRACTIONBURNIN = scalar
Fraction of iterations used for burn-in; default 0.5

GRVARIANCE = scalar or variate
Variance to generate populations from initial values of the parameters; default 0.1

PERCENTAGES = variate
Percentages for which quantiles have to be calculated; default !(2.5, 25, 50, 75, 97.5)

PROCEDURE = identifier
Identifier of procedure to calculate LOGPOSTERIOR if CALCULATION is unset; default _DEMCLOGPOSTERIOR

SEED = scalar
Seed for the random numbers; default 0

NWINDOWS = scalar
Number of histograms and scatterplots per screen when plotting estimates and logposterior from all iterations

SDLOGPOSTERIOR = scalar
Saves the s.d. for LOGPOSTERIOR

QUANTILESLOGPOSTERIOR = variate
Saves quantiles for LOGPOSTERIOR

RHATLOGPOSTERIOR = scalar
Saves the convergence criterion for LOGPOSTERIOR

ALLLOGPOSTERIOR = variate
Saves the values of LOGPOSTERIOR from all the iterations

IPOPULATIONS = pointers
Pointer to supply initial populations of the parameters and the corresponding log-posteriors

FPOPULATIONS = pointers
Pointer to save final populations of the parameters and the corresponding log-posteriors


Parameters

PARAMETER = scalars
Parameters to estimate

INITIAL = scalars
Initial values of the parameters; must be set unless IPOPULATIONS is set

SD = scalars
Standard errors of the estimates

QUANTILES = variates
Saves the quantiles for each parameter

RHAT = scalars
Convergence criteria

ALLESTIMATES = variates
Saves the parameter estimates from all the iterations


Description

DEMC uses the Differential Evolution Markov Chain algorithm of Ter Braak (2006) to do Bayesian computations by Markov chain Monte Carlo. The logarithm of the posterior density for each set of parameters can be calculated either by a list of expressions supplied by the CALCULATION option, or by a (user-defined) procedure whose name is specified by the PROCEDURE option (with default name _DEMCLOGPOSTERIOR). The names of the parameters and their initial values are specified by the PARAMETER and INITIAL parameters, respectively. Data structures containing information that is needed to calculate the log-posterior are supplied by the DATA option. Also, if you are using the CALCULATION option, you must define the identifier of the log-posterior (as used to store the results of the calculations) using the LOGPOSTERIOR option.
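   As an illustration of what such a calculation has to produce, the following Python sketch evaluates a log-posterior for one candidate parameter set. It is not Genstat code and is not the form required by CALCULATION or PROCEDURE; the model (Normal data with unknown mean and standard deviation), the flat priors and the name log_posterior are all assumptions chosen for the example.

```python
import math

def log_posterior(params, data):
    """Log-posterior of (mu, sigma) for Normal data, under assumed flat priors."""
    mu, sigma = params
    if sigma <= 0.0:
        # Outside the parameter space: the candidate can never be accepted.
        return -math.inf
    log_lik = sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        - 0.5 * ((y - mu) / sigma) ** 2
        for y in data
    )
    log_prior = 0.0  # flat priors (an assumption of this sketch)
    return log_lik + log_prior

value = log_posterior([0.0, 1.0], [0.0])
```

   The role of the data argument corresponds to the structures supplied by the DATA option, and the returned value corresponds to the identifier named by LOGPOSTERIOR.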

   The number of populations of parameters to be generated is defined as the number of parameters multiplied by the value supplied by the MULTIPLE option (default 3). The Normal variance used to generate the initial population from the initial values is specified by the GRVARIANCE option. You can set this to a scalar to use the same variance for each parameter, or to a variate to define different variances for the parameters; by default GRVARIANCE=0.1.
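   The way the initial populations are formed can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the procedure's own code: the function name initial_populations and its arguments are hypothetical, and the random-number generator differs from the one used by Genstat.

```python
import random

def initial_populations(initial, multiple=3, grvariance=0.1, seed=1):
    """Draw n = multiple * d parameter sets around the initial values."""
    rng = random.Random(seed)
    d = len(initial)
    n = multiple * d
    # A single number is shared by all parameters; a list gives one variance each.
    if isinstance(grvariance, (int, float)):
        variances = [grvariance] * d
    else:
        variances = list(grvariance)
    sds = [v ** 0.5 for v in variances]
    return [[rng.gauss(m, s) for m, s in zip(initial, sds)] for _ in range(n)]

# Two parameters and MULTIPLE=3 give six parameter sets of length two.
pops = initial_populations([1.0, 5.0], multiple=3, grvariance=[0.1, 0.4])
```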

   The NGENERATIONS option defines the number of generations to form from the populations (default 1000), and the FRACTIONBURNIN option defines the proportion of these that are used for burn-in (default 0.5). (The distributions of the parameters are determined only from the generations that are produced after burn-in is complete.) The SEED option defines a seed for the random numbers that are used within DEMC. The default value 0 continues from the previous random-number generation or (if none) initializes the seed automatically. Options UNIFORMLIMIT and STEP1, which control how the new populations are formed, are explained in the Method section.

   Once the generations are complete, the identifiers defined by PARAMETER are defined as scalars containing the means of the parameters over the populations generated after burn-in. Standard deviations and convergence criteria for the parameters can be saved, in scalars, using the SD and RHAT parameters. If RHAT is greater than 1.1, say, for any parameter, the number of generations should be increased. The QUANTILES parameter allows you to save a variate for each PARAMETER, containing quantiles at the percentages specified by the PERCENTAGES option (by default 2.5, 25, 50, 75, 97.5). To study the parameter distributions in more detail, you can also use the ALLESTIMATES parameter to save variates containing all the values generated after burn-in for each PARAMETER. The LOGPOSTERIOR, SDLOGPOSTERIOR, RHATLOGPOSTERIOR, QUANTILESLOGPOSTERIOR and ALLLOGPOSTERIOR options allow the equivalent information to be saved for the log-posterior.
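   The summaries described above can be illustrated with the Python sketch below, which treats the post-burn-in values of one parameter in each population as a separate chain. The exact formulae used inside DEMC are not given here, so the sketch assumes the usual Gelman-Rubin potential scale reduction factor for the convergence criterion and linear interpolation for the quantiles; the function names are hypothetical.

```python
import statistics

def rhat(chains):
    """Gelman-Rubin convergence criterion for equal-length chains (assumed form)."""
    length = len(chains[0])
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = length * statistics.variance([statistics.fmean(c) for c in chains])
    pooled = (length - 1) / length * within + between / length
    return (pooled / within) ** 0.5

def percentile(values, pct):
    """Quantile at percentage pct, by linear interpolation (assumed definition)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

# Two short made-up chains for a single parameter.
chains = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
pooled_values = [x for c in chains for x in c]
estimate = statistics.fmean(pooled_values)      # cf. the PARAMETER scalar
sd = statistics.stdev(pooled_values)            # cf. SD
quantiles = [percentile(pooled_values, p) for p in (2.5, 25, 50, 75, 97.5)]
criterion = rhat(chains)                        # cf. RHAT
```

   A criterion close to 1 indicates that the chains agree with one another; values well above 1 suggest that more generations are needed.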

   The final populations and corresponding log-posteriors can be saved, in a pointer, by the FPOPULATIONS option. You can then restart DEMC from the current position, and run some more generations, by using this pointer as the setting of the IPOPULATIONS option. FPOPULATIONS[1...N] each have a number of units equal to the number of parameters d, while FPOPULATIONS[N+1] has a number of units equal to N, where N = MULTIPLE × d. This can cause problems if you try to save FPOPULATIONS[] using procedure EXPORT.
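   The layout of the saved pointer can be pictured with the Python sketch below: N structures of length d holding the parameter sets, followed by one structure of length N holding their log-posteriors. The function names are hypothetical, and a Python list only stands in for a Genstat pointer.

```python
def pack_populations(pops, logpost):
    """Mimic the saved layout: N parameter sets, then one list of N log-posteriors."""
    return [list(p) for p in pops] + [list(logpost)]

def unpack_populations(pointer):
    """Recover the parameter sets and log-posteriors, e.g. to restart a run."""
    return [list(p) for p in pointer[:-1]], list(pointer[-1])

# d = 2 parameters and MULTIPLE = 3, so N = 6.
pops = [[0.1 * i, 1.0 + 0.1 * i] for i in range(6)]
logpost = [-0.5 * i for i in range(6)]
saved = pack_populations(pops, logpost)
lengths = [len(s) for s in saved]   # six structures of length 2, one of length 6
restart_pops, restart_logpost = unpack_populations(saved)
```

   The unequal lengths are what distinguish the last structure from the others.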

 

Options: PRINT, CALCULATION, LOGPOSTERIOR, MULTIPLE, UNIFORMLIMIT, DATA, NGENERATIONS, STEP1, FRACTIONBURNIN, GRVARIANCE, PERCENTAGES, PROCEDURE, SEED, NWINDOWS, SDLOGPOSTERIOR, QUANTILESLOGPOSTERIOR, RHATLOGPOSTERIOR, ALLLOGPOSTERIOR, IPOPULATIONS, FPOPULATIONS.

Parameters: PARAMETER, INITIAL, SD, QUANTILES, RHAT, ALLESTIMATES.


Method

DEMC uses the DE-MC algorithm of Ter Braak (2006) to perform Markov chain Monte Carlo (MCMC); see Congdon (2001, 2003), Gelman et al. (2004) or Lee (2003). The DE-MC algorithm combines the genetic algorithm called Differential Evolution (DE) with MCMC. The values of the INITIAL parameter are used to generate n parameter sets, by generating d independent Normal deviates with means INITIAL and variance GRVARIANCE. Here, d is the number of parameters, and n is d multiplied by the value of the MULTIPLE option.

   For each parameter set i (i=1...n), the algorithm selects two other parameter sets at random, and calculates the differences between their parameter values, multiplied by a parameter γ. These scaled differences, together with random numbers taken from the uniform distribution on (-UNIFORMLIMIT, UNIFORMLIMIT), are added to the parameter values in set i to form a new candidate set of values. The parameter γ generally takes the value 2.38/√(2×d), but the STEP1 option allows you to define generations in which γ takes the value 1 (by default there are none). The candidate set replaces set i if its log-posterior is greater than the log-posterior of set i plus the logarithm of a random number from the uniform distribution on (0,1); see Ter Braak (2006).
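   The update rule can be sketched in Python as follows. This is an illustration of the steps described above under stated assumptions, not the code of the DEMC procedure: the target log_posterior (independent standard Normal parameters), the function name demc, the truncation used to count burn-in generations, and the choice to update the sets one after another within a generation are all assumptions of the sketch.

```python
import math
import random

def log_posterior(theta):
    # Hypothetical target: independent standard Normal for every parameter.
    return -0.5 * sum(x * x for x in theta)

def demc(initial, ngenerations=1000, multiple=3, grvariance=0.1,
         uniformlimit=0.00001, step1=(), fractionburnin=0.5, seed=1):
    rng = random.Random(seed)
    d = len(initial)
    n = multiple * d
    sd = math.sqrt(grvariance)
    pops = [[rng.gauss(m, sd) for m in initial] for _ in range(n)]
    logpost = [log_posterior(p) for p in pops]
    nburn = int(fractionburnin * ngenerations)  # rounding rule is an assumption
    kept = []
    for g in range(1, ngenerations + 1):
        # gamma is 1 in the generations listed in step1, else 2.38/sqrt(2d).
        gamma = 1.0 if g in step1 else 2.38 / math.sqrt(2 * d)
        for i in range(n):
            # Two distinct parameter sets, both different from set i.
            r1, r2 = rng.sample([j for j in range(n) if j != i], 2)
            candidate = [
                pops[i][k] + gamma * (pops[r1][k] - pops[r2][k])
                + rng.uniform(-uniformlimit, uniformlimit)
                for k in range(d)
            ]
            cand_lp = log_posterior(candidate)
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if cand_lp > logpost[i] + math.log(1.0 - rng.random()):
                pops[i], logpost[i] = candidate, cand_lp
        if g > nburn:
            kept.extend(list(p) for p in pops)  # keep post-burn-in values only
    return kept

draws = demc([1.0, -1.0])
means = [sum(p[k] for p in draws) / len(draws) for k in range(2)]
```

   With two parameters and the default settings, the sketch keeps 6 parameter sets from each of the last 500 generations, and the means of the kept values should lie close to the target means of zero.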


References

Congdon, P. (2001). Bayesian Statistical Modelling. Wiley, Chichester, England.

Congdon, P. (2003). Applied Bayesian Modelling. Wiley, Chichester, England.

Gelman, A., Carlin, J.B., Stern, H.S. & Rubin, D.B. (2004). Bayesian Data Analysis, 2nd Edition. Chapman & Hall, London.

Lee, P.M. (2003). Bayesian Statistics: an Introduction, 3rd Edition. Arnold, London.

Ter Braak, C.J.F. (2006). A Markov chain Monte Carlo version of the genetic algorithm Differential Evolution: easy Bayesian computing for real parameter spaces. Statistics & Computing, 16, in press.