Markov chain Monte Carlo (abbreviated MCMC) is a way to use computer simulation to approximate the area under a probability density curve. MCMC is very popular in Bayesian statistics because answering almost any question in this field comes down to finding an area under a posterior probability density. Normally, the model being used has many parameters, which means we are really finding a volume, but it is useful to start by considering simple examples involving only a single unknown parameter, in which case the volume reduces to the area under the posterior density curve.

As an example, consider flipping a coin and making inferences concerning the parameter p, the probability that the coin lands heads-up on any given flip. Being a probability, p must lie between 0 and 1 (inclusive). The figure on the right shows the prior probability density (dotted black line) and the corresponding posterior probability density (solid blue line) for the case in which y=6 heads were observed out of n=10 flips. The prior and posterior are both Beta distributions in this case. The prior is Beta(2,2), which is vague but concentrates the probability mass around the value p = 0.5, reflecting our prior belief that the coin is reasonably fair. The posterior is Beta(2+y,2+n-y), which here is Beta(8,6). The fact that it is more concentrated (has a smaller variance) than the prior indicates that the data carry some information about the unknown value p.
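The update from prior to posterior described above can be checked with a few lines of Python. This is only a minimal sketch: the function names `update_beta` and `beta_mean_var` are made up for illustration, and the variance formula used is the standard one for a Beta(a,b) distribution, ab / ((a+b)^2 (a+b+1)).

```python
def update_beta(a, b, y, n):
    """Posterior Beta parameters after observing y heads in n flips,
    starting from a Beta(a, b) prior (hypothetical helper name)."""
    return a + y, b + n - y


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    total = a + b
    mean = a / total
    var = a * b / (total ** 2 * (total + 1))
    return mean, var


prior = (2, 2)          # the Beta(2,2) prior from the text
y, n = 6, 10            # 6 heads observed in 10 flips

posterior = update_beta(*prior, y, n)
prior_mean, prior_var = beta_mean_var(*prior)
post_mean, post_var = beta_mean_var(*posterior)

print("posterior parameters:", posterior)
print("prior     mean, variance:", prior_mean, prior_var)
print("posterior mean, variance:", post_mean, post_var)
```

The prior variance comes out as 0.05 and the posterior variance as roughly 0.0163, which is the "more concentrated" posterior visible in the figure.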

Suppose we define “fair” to mean that the value of p lies between 0.45 and 0.55. It does not make sense to require p to equal 0.5 exactly, because under a continuous distribution the probability that p takes any single value, including exactly 0.5, is zero! Assuming everyone can agree that the interval (0.45,0.55) is a reasonable definition of a fair coin, we can compute the probability that the coin we have just flipped is fair by computing the area under the posterior density curve from 0.45 to 0.55 (this area is shaded in the figure). Any question you have about p can be answered by integrating the posterior density over an interval that corresponds to the question of interest.
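To make "integrating the posterior density over an interval" concrete, here is a rough sketch that integrates the Beta(2+y,2+n-y) = Beta(8,6) posterior numerically. Simpson's rule is just one convenient choice of quadrature, not something prescribed by the text, and the helper names `beta_pdf` and `simpson` as well as the second question (is p greater than 0.5?) are illustrative assumptions.

```python
import math


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * p ** (a - 1) * (1 - p) ** (b - 1)


def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule on [lo, hi] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def posterior(p):
    # Beta(8, 6) posterior for y = 6 heads in n = 10 flips
    return beta_pdf(p, 8, 6)


# "Is the coin fair?"  ->  area between 0.45 and 0.55
area_fair = simpson(posterior, 0.45, 0.55)

# A different (hypothetical) question: "Does the coin favor heads?"
prob_favors_heads = simpson(posterior, 0.5, 1.0)

print("P(0.45 < p < 0.55) =", area_fair)
print("P(p > 0.5)         =", prob_favors_heads)
```

Each question about p simply changes the limits of integration; the density being integrated stays the same.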

This example is simple enough that it can be solved analytically: the shaded area of interest is 0.2480408. When a model is more complicated, involving many unknown parameters, the posterior distribution often does not correspond to any of the well-known probability distributions. In such cases, we would be glad to be able to approximate the areas (or volumes) we need. This is where MCMC comes in. MCMC allows us to approximate volumes under very complicated posterior distributions using computer simulation.
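As a small illustration of the idea, the sketch below approximates the same shaded area by simulation. The text does not say which MCMC algorithm to use, so a random-walk Metropolis sampler is assumed here; the step size, burn-in length, number of samples, and random seed are all arbitrary illustrative choices. Note that the sampler only needs the posterior up to a constant, p^7 (1-p)^5, which is what makes the approach usable when the normalizing constant is unknown.

```python
import math
import random


def log_unnormalized_posterior(p, a=8, b=6):
    """Log of p^(a-1) * (1-p)^(b-1); the normalizing constant is not needed."""
    return (a - 1) * math.log(p) + (b - 1) * math.log(1 - p)


def metropolis(n_samples, step=0.2, start=0.5, burn_in=1000, seed=1):
    """Random-walk Metropolis sampler for p on (0, 1) (assumed algorithm)."""
    rng = random.Random(seed)
    p = start
    log_dens = log_unnormalized_posterior(p)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = p + rng.gauss(0.0, step)
        # Proposals outside (0, 1) have zero posterior density: always rejected.
        if 0.0 < proposal < 1.0:
            log_dens_new = log_unnormalized_posterior(proposal)
            accept_prob = math.exp(min(0.0, log_dens_new - log_dens))
            if rng.random() < accept_prob:
                p, log_dens = proposal, log_dens_new
        if i >= burn_in:
            samples.append(p)  # a rejected proposal repeats the current value
    return samples


samples = metropolis(100_000)

# The area under the posterior between 0.45 and 0.55 is approximated by
# the fraction of sampled values that fall inside that interval.
estimate = sum(0.45 < s < 0.55 for s in samples) / len(samples)
print(f"MCMC estimate of P(0.45 < p < 0.55): {estimate:.3f}")
```

The estimate should land close to the analytical value 0.2480408, with the remaining discrepancy due to simulation noise that shrinks as more samples are drawn.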