DIMACS Workshop on Markov Chain Monte Carlo: Synthesizing Theory and Practice

June 4 - 7, 2007
DIMACS Center, CoRE Building, Rutgers University

Jim Fill, Johns Hopkins University, jimfill@jhu.edu
Jim Hobert, University of Florida, jhobert@stat.ufl.edu
Antonietta Mira, University of Insubria (Italy), amira@eco.unisubria.it
Luke Tierney, University of Iowa, luke@stat.uiowa.edu
Peter Winkler, Dartmouth College, peter.winkler@dartmouth.edu
Presented under the auspices of the Special Focus on Discrete Random Systems.

In the past two decades researchers in discrete mathematics and the theory of computing have made enormous theoretical strides in sampling via Markov chains, sometimes known as MCMC (Markov Chain Monte Carlo). A huge variety of computations is now proved to be doable in randomized polynomial time to arbitrary accuracy; included are the volumes of convex bodies in high dimension, the permanent of a matrix, the partition function for some models in statistical physics, and many more. At the same time, statisticians have made huge strides in implementing MCMC in complex applications; the applications of MCMC to Bayesian statistical inference probably outnumber all the computer science and physics applications combined.

Yet there has been little interaction between the theoretical mathematics community and the practical statistics one. In many cases, the theoreticians have gone off in directions of little interest in practical statistics; in others, they may have overlooked directions that could have changed the way things are done. On the other side, statisticians may not be aware of advances that really could impact their methods. Communication between theoreticians and researchers who actually handle data would be extremely beneficial.

Here are some issues dividing theory and practice, where intriguing questions remain for both sides:

    1. Distance to stationarity: In MCMC a Markov chain is run until its state distribution is as close as needed to stationarity. Theoreticians typically require closeness in total variation, meaning that any event has nearly the same probability as it would in stationarity. But, in practice (and even in theory --- for example in computing volumes) only a very limited class of events is of interest. Bayer and Diaconis recognized this fact in recommending seven riffle shuffles of a deck of cards; this still leaves a huge total variation of~30 but in practice is good enough.

    2. Diagnosis of mixing: Closely related is the issue of deciding when, in running an actual Markov chain, to stop and sample. If coupling is easy to diagnose, as in a monotone chain, there may be a convenient criterion. But in many cases, there is no good bound on the mixing time and we don't really know when to stop. For example, László Lovász has asked the following simple question: Suppose that proper colorings of a graph are being sampled by means of local recoloring (heat bath), and suppose we have reached the point where the color of a particular vertex is close to uniformly random. Does this mean the coloring of the whole graph is now uniformly random? What criteria are used in practice to diagnose mixing?

    3. Exact sampling: Propp, Wilson, Fill and others have developed some startling methods of sampling from the stationary distribution of a Markov chain. Are these methods useful in practice? If so, are they being used?

    4. Random updates vs. sweeps: In many cases, Markov chains proceeding by local dynamics, with sites selected uniformly at random, are known to be rapidly mixing. However, in practice, it is easier to do updates by sweeping through the sites repeatedly in some fixed order. It appears empirically (and intuitively) that sweeping is just as good as random updates, but proof is lacking except in special cases.

    5. Heuristically fast chains: Similarly, theoreticians have shown that certain Markov chains (like the ``ball walk'' inside a convex body in euclidean space) mix rapidly. But other chains --- e.g., the ``hit-and-run'' chain where a random line through the current point is chosen, then a point taken uniformly from the intersection of the line with the body --- may well be faster, yet harder to analyze. More generally, there is often some flexibility in designing a Markov chain to achieve a given stationary distribution. For analysis purposes, it is desirable to make the chain simple; for practical purposes, one wishes to design the chain to admit easy updates and move quickly around the state space. Can these be reconciled?

    6. Competing methods: MCMC is popular in the ``theory community'' of computer science, but in the statistical world it is only one of many methods for random sampling. Even when a Markov chain is already in the picture, heuristic approaches like sequential importance sampling and variations of simulated annealing or simulated tempering may be superior in practice to straight, analyzable running of the chain. Is there any way to know when this will be the case?

In the workshop, we propose to address these issues both generally and in special cases; for example, in the generation of random contingency tables (matrices with given row and column sums); in genomics; and in graphical models. Our planned invitees include many of the major contributors to MCMC, but leaning toward those who have demonstrated interest in practical considerations; and a sampling, hopefully somewhat representative, of potential consumers of MCMC theory from the statistics community and elsewhere.

We plan to begin with a one-day tutorial on MCMC and on sampling issues in general, understandable to graduate students and a general math/statistics/computer science audience. The tutorial will be about half devoted to mathematical results on MCMC and half to MCMC in statistics.

Next: Call for Participation
Workshop Index
DIMACS Homepage
Contacting the Center
Document last modified on October 25, 2006.