Tutorial Step 6: Some Things to Try

Here are a few suggested exercises to illustrate various aspects of MCMC. While you are free to try these suggestions in any order, this is the order I normally use when demonstrating the app, and in my experience this ordering seems to work quite well.

Let the robot walk on a landscape with no hills

Without hills, the robot takes a random walk and bounces off the edges of the field like a billiard ball. This is equivalent to exploring a two-dimensional uniform distribution. Any two non-overlapping rectangles of equal area should contain approximately the same number of points if the robot has taken enough steps.
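The equal-area claim can be checked with a short simulation. The following is only a sketch, not the app's actual code: the field size, the Gaussian step, and the names reflect, flat_walk, and count_in are all assumptions made for illustration.

```python
import random

def reflect(value, lo, hi):
    """Fold a coordinate back into [lo, hi], like a ball bouncing off a wall."""
    while value < lo or value > hi:
        value = 2 * lo - value if value < lo else 2 * hi - value
    return value

def flat_walk(n_steps, width=400.0, height=300.0, step_sd=50.0, seed=1):
    """Random walk on a field with no hills (hypothetical size, in pixels)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0  # start in a corner
    points = []
    for _ in range(n_steps):
        # With no hills every proposed step is accepted; only the edges matter.
        x = reflect(x + rng.gauss(0.0, step_sd), 0.0, width)
        y = reflect(y + rng.gauss(0.0, step_sd), 0.0, height)
        points.append((x, y))
    return points

def count_in(points, x0, y0, x1, y1):
    """Number of points inside the rectangle with corners (x0, y0) and (x1, y1)."""
    return sum(1 for x, y in points if x0 <= x < x1 and y0 <= y < y1)

points = flat_walk(200_000)
corner_count = count_in(points, 0, 0, 100, 100)       # 100 x 100 rectangle
middle_count = count_in(points, 250, 150, 350, 250)   # another 100 x 100 rectangle
expected = len(points) * (100 * 100) / (400 * 300)
print(corner_count, middle_count, round(expected))
```

The two counts should both be close to the expected value, and the agreement should improve as the number of steps grows.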

One hill

Create a single hill in the center of the field. This hill is actually a symmetric bivariate normal density function, so it gradually slopes down all the way to the edges of the field. This explains why even starting in a corner the robot has no trouble finding its way to the hill: uphill steps are always accepted, and any step toward the hill is uphill. This setup also nicely illustrates the concept of burn-in: the initial steps the robot takes on its way from the starting point to the hill. Some statisticians exclude these initial steps because they are clearly atypical samples from the distribution. Others believe that if you feel the need to exclude some initial points, then you probably haven’t let the robot walk long enough!
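The acceptance rule and the burn-in phase can be sketched in a few lines. This is an illustration under stated assumptions rather than the app's implementation: the hill position and spread, the Gaussian proposal, and the names log_height, metropolis_walk, and distance_to_hill are hypothetical, and the field edges are ignored.

```python
import math
import random

HILL_X, HILL_Y, HILL_SD = 200.0, 150.0, 40.0  # hypothetical hill centre and spread

def log_height(x, y):
    """Log of the (unnormalised) symmetric bivariate normal hill height."""
    return -((x - HILL_X) ** 2 + (y - HILL_Y) ** 2) / (2.0 * HILL_SD ** 2)

def metropolis_walk(n_steps, start=(0.0, 0.0), step_sd=20.0, seed=2):
    rng = random.Random(seed)
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        nx = x + rng.gauss(0.0, step_sd)
        ny = y + rng.gauss(0.0, step_sd)
        log_ratio = log_height(nx, ny) - log_height(x, y)
        # Uphill steps (log_ratio >= 0) are always accepted; downhill steps
        # are accepted with probability equal to the ratio of heights.
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x, y = nx, ny
        path.append((x, y))
    return path

def distance_to_hill(point):
    return math.hypot(point[0] - HILL_X, point[1] - HILL_Y)

path = metropolis_walk(5000)
# Treat the steps taken before first coming within 2 SD of the hilltop as burn-in.
burn_in = next((i for i, p in enumerate(path) if distance_to_hill(p) < 2.0 * HILL_SD),
               len(path))
print("steps spent approaching the hill:", burn_in)
```

The 2 SD cutoff used to mark the end of burn-in is an arbitrary choice for this sketch; in practice, where burn-in ends is a judgment call.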

If you click on the Stats button after letting the robot walk around for a while, you will get a summary that includes the total number of steps taken, the number of steps inside the inner (50%) contour, and the number of steps inside the outer (95%) contour. The 50% contour contains 50% of the volume under the bivariate normal density (the hill), so ideally the MCMC analysis will show that approximately 50% of the robot's steps have landed inside this circle. The same goes for the outer contour: in this case, approximately 95% of the robot's steps should be inside the outer circle. The approximation should be better for long walks than for short walks.
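For a symmetric bivariate normal hill with standard deviation sd, the circle containing a fraction p of the volume has radius sd * sqrt(-2 ln(1 - p)). The sketch below uses that fact to mimic the Stats summary. The hill spread, the Gaussian proposal, and the function names are assumptions for illustration, and the field edges are ignored.

```python
import math
import random

SD = 40.0  # hypothetical hill spread in pixels; the hilltop is placed at the origin

def contour_radius(p, sd=SD):
    """Radius of the circle holding fraction p of a symmetric bivariate normal."""
    return sd * math.sqrt(-2.0 * math.log(1.0 - p))

def sample_hill(n_steps, step_sd=40.0, seed=3):
    """Metropolis walk started on the hilltop; returns distance from the top at each step."""
    rng = random.Random(seed)
    x = y = 0.0
    radii = []
    for _ in range(n_steps):
        nx, ny = x + rng.gauss(0.0, step_sd), y + rng.gauss(0.0, step_sd)
        log_ratio = (x * x + y * y - nx * nx - ny * ny) / (2.0 * SD ** 2)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x, y = nx, ny
        radii.append(math.hypot(x, y))
    return radii

def contour_fractions(radii):
    """Fractions of steps inside the 50% and 95% contours."""
    r50, r95 = contour_radius(0.50), contour_radius(0.95)
    n = len(radii)
    return sum(r <= r50 for r in radii) / n, sum(r <= r95 for r in radii) / n

short_walk = contour_fractions(sample_hill(500))
long_walk = contour_fractions(sample_hill(100_000))
print("short walk:", short_walk)
print("long walk: ", long_walk)
```

The long walk should land close to 0.50 and 0.95, while the short walk will usually be noticeably further off.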

Small steps

Open the settings and set the mean and standard deviation of the step length to very small values (e.g. set both to 1). Now the robot will take baby steps. Reset by pressing the rewind button (second from right on the bottom toolbar) and press the play button (far right on the bottom toolbar) a few times. You should notice that the robot takes much longer to reach the hill now because it is taking very tiny steps. Go to the settings again and set “Show fails” to ON. Note that very few steps have been rejected. This demonstrates one way in which MCMC can be inefficient. Taking very tiny steps has the benefit of few rejected proposals, but it nevertheless exhibits poor mixing because the robot requires lots of steps to get anywhere. This does not invalidate MCMC; it merely means that a longer run will be needed to achieve the same effective sample size. The effective sample size would equal the actual sample size only if the sampled points were drawn directly and independently from the target distribution. In this case, however, the sampled points are highly autocorrelated, and thus the actual sample size is much greater than the effective sample size.
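One common way to put a number on this is to estimate the effective sample size from the autocorrelation of the samples. The sketch below uses a simple estimator (sample size divided by one plus twice the sum of the leading positive autocorrelations); it is an assumption of this example, not something the app reports, and the hill spread and function names are hypothetical.

```python
import math
import random

HILL_SD = 40.0  # hypothetical hill spread in pixels; the hilltop is at the origin

def walk_x(n_steps, step_sd, seed=6):
    """Metropolis walk on one hill; returns the x-coordinates and the acceptance rate."""
    rng = random.Random(seed)
    x = y = 0.0
    xs, accepted = [], 0
    for _ in range(n_steps):
        nx, ny = x + rng.gauss(0.0, step_sd), y + rng.gauss(0.0, step_sd)
        log_ratio = (x * x + y * y - nx * nx - ny * ny) / (2.0 * HILL_SD ** 2)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x, y = nx, ny
            accepted += 1
        xs.append(x)
    return xs, accepted / n_steps

def effective_sample_size(xs, max_lag=500):
    """Rough ESS: n / (1 + 2 * sum of autocorrelations up to the first non-positive one)."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [v - mean for v in xs]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        return float(n)
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * var)
        if acf <= 0.0:
            break
        tau += 2.0 * acf
    return n / tau

n = 5000
small_xs, small_rate = walk_x(n, step_sd=1.0)
medium_xs, medium_rate = walk_x(n, step_sd=40.0)
rng = random.Random(7)
iid_xs = [rng.gauss(0.0, HILL_SD) for _ in range(n)]  # direct, independent draws

ess_small = effective_sample_size(small_xs)
ess_medium = effective_sample_size(medium_xs)
ess_iid = effective_sample_size(iid_xs)
print(f"tiny steps:   accepted {small_rate:.1%}, ESS about {ess_small:.0f} of {n}")
print(f"medium steps: accepted {medium_rate:.1%}, ESS about {ess_medium:.0f} of {n}")
print(f"independent draws: ESS about {ess_iid:.0f} of {n}")
```

Because the lag cutoff truncates the sum, the tiny-step figure is if anything an overestimate; the point is only that a high acceptance rate can coexist with a very small effective sample size.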

Big steps

Now examine the kind of inefficiency that results from taking steps that are too big. Open the settings and make the mean step length very large (say 150 pixels) while keeping the standard deviation at or near 1. This setting means that the robot will invariably take a very large step, which often fails because taking such a step would move the robot off the hill into the (nearly) flat area around the margins. If you have “Show fails” still turned ON, you will be able to see that many more proposed steps fail to be accepted. Nevertheless, it does not take the robot long to move around the hill, so this “Big Steps” strategy is better (in this case) than the “Small Steps” strategy, but an intermediate step size would obviously be ideal because it would combine larger steps with fewer rejections.
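The trade-off between step size and rejection rate can be seen by tallying accepted proposals for several mean step lengths. This is a sketch under assumptions: the proposal picks a uniformly random direction and a step length drawn around the mean (the app's actual step-length distribution is not specified here), and the hill spread and the name acceptance_rate are hypothetical.

```python
import math
import random

HILL_SD = 60.0  # hypothetical hill spread in pixels; the hilltop is at the origin

def acceptance_rate(mean_step, step_sd=1.0, n_steps=20_000, seed=5):
    """Fraction of proposed steps accepted for a given mean step length."""
    rng = random.Random(seed)
    x = y = 0.0
    accepted = 0
    for _ in range(n_steps):
        # Symmetric proposal: uniform random direction, step length drawn
        # around mean_step (the absolute value keeps the length non-negative).
        length = abs(rng.gauss(mean_step, step_sd))
        angle = rng.uniform(0.0, 2.0 * math.pi)
        nx, ny = x + length * math.cos(angle), y + length * math.sin(angle)
        log_ratio = (x * x + y * y - nx * nx - ny * ny) / (2.0 * HILL_SD ** 2)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x, y = nx, ny
            accepted += 1
    return accepted / n_steps

rates = {step: acceptance_rate(step) for step in (1.0, 40.0, 150.0)}
for step, rate in rates.items():
    print(f"mean step {step:5.0f} px: {rate:.1%} of proposals accepted")
```

The acceptance rate should fall steadily as the mean step length grows, which is the pattern “Show fails” makes visible in the app.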

Multiple chains

Create a landscape with two widely spaced, relatively small hills. If you have one hill already defined, you can resize it with a pinch gesture and move it by dragging it with your finger (it must be selected before it can be resized or moved, however). Set the mean step length and its standard deviation both to about 20 pixels and let your robot go for a walk. Chances are, it will get stuck on one of the two hills, usually the one that is closest to the starting point. This illustrates a potential danger of using MCMC to approximate probability distributions. If the target distribution is multimodal, the run may look perfectly normal and yet miss sampling one of the two hills entirely!

This is where multiple chains help. Choose to run 4 chains, but show only chain 1 (the cold chain). Now you should see lots of jumps between the two hills when you let the robot take a walk. Set steps/foray to 10000 and let the robot go on 5 forays to generate 50000 points. Now turn off trajectories (set “Show trajectory” to OFF) and show all chains. You should see that the cold chain does a nice job of approximating the bimodal target distribution, but the other chains, which are heated to varying degrees, travel much farther afield than the cold chain robot. The heated chains serve as scouts. After each robot takes a step, two robots are chosen randomly and asked if they can swap places. Each of the two robots proposes to go to the place where the other one is standing, and if both moves are accepted, a swap takes place. This process often allows the cold chain robot to jump the deep valley between the two hills, which it would have trouble doing without scouts showing the way.
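The swap procedure described above can be sketched as follows. Everything specific here is an assumption for illustration: the hill positions and spread, the Gaussian proposal, and especially the heating scheme, in which chain k explores the landscape raised to the power 1 / (1 + heat * k), so that hotter chains see flatter hills. The field edges are ignored.

```python
import math
import random

HILLS = [(-70.0, 0.0), (70.0, 0.0)]  # hypothetical hill centres (pixels)
HILL_SD = 15.0                        # hypothetical hill spread

def log_height(pos):
    """Log height of an equal mixture of two symmetric normal hills (unnormalised)."""
    x, y = pos
    terms = [-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * HILL_SD ** 2) for cx, cy in HILLS]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def accept(log_ratio, rng):
    """Metropolis rule: always accept uphill, sometimes accept downhill."""
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)

def run_chains(n_chains, n_steps, step_sd=20.0, heat=1.0, seed=4):
    rng = random.Random(seed)
    # Assumed heating scheme: power 1 for the cold chain, smaller for hotter chains.
    powers = [1.0 / (1.0 + heat * k) for k in range(n_chains)]
    positions = [HILLS[0]] * n_chains  # every robot starts on the left hill
    cold_samples = []
    for _ in range(n_steps):
        for k in range(n_chains):
            x, y = positions[k]
            proposal = (x + rng.gauss(0.0, step_sd), y + rng.gauss(0.0, step_sd))
            if accept(powers[k] * (log_height(proposal) - log_height(positions[k])), rng):
                positions[k] = proposal
        if n_chains > 1:
            i, j = rng.sample(range(n_chains), 2)
            diff = log_height(positions[j]) - log_height(positions[i])
            # Each robot decides separately whether to move to the other's spot;
            # the swap happens only if both moves are accepted.
            if accept(powers[i] * diff, rng) and accept(-powers[j] * diff, rng):
                positions[i], positions[j] = positions[j], positions[i]
        cold_samples.append(positions[0])
    return cold_samples

def left_fraction(samples):
    """Fraction of samples on the left half of the field."""
    return sum(1 for x, _ in samples if x < 0.0) / len(samples)

single = left_fraction(run_chains(1, 50_000))
coupled = left_fraction(run_chains(4, 50_000))
print(f"one chain:   {single:.1%} of cold-chain samples on the left hill")
print(f"four chains: {coupled:.1%} of cold-chain samples on the left hill")
```

Since the two hills are the same size, the ideal answer is about 50%. With four chains the cold chain should come reasonably close, whereas a lone chain will typically remain heavily weighted toward the hill it started on.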
