There is a lot of literature out there about Markov chain Monte Carlo (MCMC) convergence diagnostics, including the most popular Gelman-Rubin diagnostic. However, all of these assess the convergence of the Markov chain, and thus address the question of burn-in.
Once I have figured out burn-in, how should I decide how many MCMC samples are enough to continue with my estimation process? Most of the papers using MCMC mention that they ran the Markov chain for some $n$ iterations, but do not say anything about why/how they chose that number, $n$.
In addition, one desired sample size cannot be the answer for all samplers, since the correlation in the Markov chain varies greatly from problem to problem. So is there a rule out there to find out the number of samples required?
Best Answer
How many samples (post burn-in) that you need depends on what you are trying to do with those samples and how your chain mixes.
Typically we are interested in posterior expectations (or quantiles) and we approximate these expectations by averages of our posterior samples, i.e. $$ E[h(theta)|y] approx frac{1}{M} sum_{m=1}^M h(theta^{(m)}) = E_M $$ where $theta^{(m)}$ are samples from your MCMC. By the Law of Large Numbers the MCMC estimate converges almost surely to the desired expectation.
But to address the question of how many samples we need to be assured that we are close enough to the desired expectation, we need a Central Limit Theorem (CLT) result, i.e. something like $$ frac{E_M -E[h(theta)|y]}{sqrt{M}} stackrel{d}{to} N(0,v_h^2) $$ With this CLT we could then make probabilitic statements such as "there is 95% probability that $E[h(theta)|y]$ is between $E_M pm 1.96 v_h$." The two issues here are
- Does the CLT apply?
- How can we estimate $v_h^2$.
We have some results on when the CLT applies, e.g. discete state chains, reversible chains, geometrically ergodic chains. See Robert and Casella (2nd ed) section 6.7.2 for some results in this direction. Unfortunately the vast majority of Markov chains that are out there have no proof that CLT exists.
If a CLT does exist, we still need to estimate the variance in the CLT. One way to estimate this variance involves breaking the chain up into blocks, see Gong and Flegal and references therein. The method has been implemented in the R package mcmcse
with the functions mcse
and mcse.q
to estimate the variance for expectations and quantiles.
Similar Posts:
- Solved – MCMC methods – burning samples
- Solved – How to use draws from a Markov chain (Monte Carlo) for (bayesian) inference
- Solved – the connection between Markov chain and Markov chain monte carlo
- Solved – When we run many chains at once in an MCMC model, how are they combined together for the posterior draws
- Solved – Is Markov chain based sampling the “best” for Monte Carlo sampling? Are there alternative schemes available