DIMACS Workshop on Markov Chain Monte Carlo: Synthesizing Theory and Practice

June 4 - 7, 2007
DIMACS Center, CoRE Building, Rutgers University

Organizers:
Jim Fill, Johns Hopkins University, jimfill@jhu.edu
Jim Hobert, University of Florida, jhobert@stat.ufl.edu
Antonietta Mira, University of Insubria (Italy), amira@eco.unisubria.it
Luke Tierney, University of Iowa, luke@stat.uiowa.edu
Peter Winkler, Dartmouth College, peter.winkler@dartmouth.edu
Presented under the auspices of the Special Focus on Discrete Random Systems.
Yves Atchadé, University of Michigan

Title: Efficiency of Adaptive MCMC

Recently, many adaptive MCMC algorithms have been proposed partly as a solution to the parameter-tuning problem in classical MCMC. The idea is to design Monte Carlo samplers that can optimize their own parameters for a given sampling problem. This talk will discuss the efficiency of such procedures; the notion of "efficiency" concerns the possibility for adaptive MCMC samplers to behave like the "optimal" MCMC sampler. (This is joint work with Christophe Andrieu, Bristol University.)


Isabel Beichl, NIST

Title: Using SIS to Speed Up MCMC

We describe strategies for speeding up MCMC so that it can be used for practical computation on hard problems, such as estimating the number of monomer-dimer covers of a lattice. Our approach is based first on using aggregation to get a simplified version of the problem. The simplified problem is then used to get approximate values that can be used to define fugacity for use in MCMC to obtain accurate results faster.

Our second technique is based on sequential importance sampling (SIS). We show how to use SIS to generate the transition matrices for the aggregated process and we comment on possible uses of SIS allowing direct approximation of the constants needed for MCMC without requiring aggregation.

(This is joint work with Francis Sullivan, IDA/Center for Computing Sciences.)


Andrew Beveridge, Carnegie Mellon University

Title: Centers for Random Walks on Trees

We consider two distinct centers which arise in measuring how quickly a random walk on a tree mixes. Lovasz and Winkler have shown that stopping rules which "look where they are going" (rather than simply walking a fixed number of steps) can achieve a desired distribution exactly and efficiently. Considering an optimal stopping rule that reflects some aspect of mixing, we can use the expected length of this rule as a mixing measure. On trees, a number of these mixing measures identify particular nodes with central properties. In this context, we study a variety of natural notions of centrality. Each of these criteria identifies the barycenter of the tree as the "average" center and the newly defined focus as the "extremal" center.


Ivona Bezakova, Rochester Institute of Technology

Title: Binary Contingency Tables in Theory and Practice

The binary contingency tables problem, arising in the design of statistical tests, is to estimate the number of 0/1 matrices with prescribed row and column sums. Several approaches have been considered for solving the problem -- most notably sequential importance sampling (SIS) and simulated annealing. We will describe both methods and discuss their limitations.

The sequential importance sampling method of Chen, Diaconis, Holmes, and Liu is guaranteed to converge to the correct count and it appears to work well in practice. Recently, Blanchet showed that for n x n tables with row sums and column sums bounded by o(n^{1/4}) the method converges in O(n^2) trials. However, in the worst case the required time is exponential. More precisely, with Sinclair, Stefankovic, and Vigoda we give a family of examples in which the SIS procedure, if run for any subexponential number of trials, will underestimate the number of tables by an exponential factor. This result holds for any of the usual design choices in the SIS algorithm, namely, the ordering of the columns and rows.

On the other hand, a simulated annealing approach yields a fully polynomial approximation scheme (fpras) for any input, as was first shown by Jerrum, Sinclair, and Vigoda. In this talk we describe a joint result with Bhatnagar and Vigoda which uses a different and faster simulated annealing approach; the main idea is to start the annealing process with a nontrivial instance. This is the currently fastest fpras for the problem, running in time O(n^{11}) for n x n tables.


Jose H. Blanchet, Harvard University

Title: Designing and Testing Efficient Sequential Importance Sampling Algorithms

Sequential importance sampling has been reported to give excellent empirical performance in practical problems -- some of them related to statistical inference and others related to performance analysis of stochastic systems in engineering. The theoretical analysis of importance samplers has a long tradition within the rare-event analysis community and some of the techniques that have been developed recently are useful in applications of interest to computer scientists and statisticians. In order to illustrate these techniques, I'll study in detail a sequential importance sampler proposed by Chen, Diaconis, Holmes, and Liu (2005) for the statistical analysis of binary contingency tables with fixed margins. Chen et al (2005) use their algorithm to count the number of such tables. We shall illustrate how our techniques, based on Lyapunov bounds for Markov chains, allow us to compute a quantitative bound for the variance of the estimator of Chen et al (2005) and also prove that such algorithm allows (assuming that the margins grow slowly as the size of the table increases) for the approximate counting of the number of such tables with basically quadratic complexity.


Kate Cowles, The University of Iowa

Title: Computing for Bayesian Spatial Estimation and Prediction with Application to Residential Radon

Because exposure to radon gas in buildings is a likely risk factor for lung cancer, estimation of residential radon levels is an important public health endeavor. We describe an efficient MCMC algorithm that enables fitting a Bayesian geostatistical model that appropriately combines areal radon data with point-source measurements of indoor home radon in the state of Iowa.

(This is joint work with Brian Smith and Jun Yan.)


Murali Haran, The Pennsylvania State University

Title: Exact and Approximate Monte Carlo for Spatial Models

Hierarchical Bayesian approaches to modeling spatial data are becoming increasingly common. However, even fairly standard spatial models often result in posterior distributions that present challenges to non-expert users of Markov chain Monte Carlo (MCMC) methods. In addition to having to figure out an appropriate MCMC algorithm for each new problem, MCMC users have two long-standing questions: How long should one run the chain and, once the algorithm is stopped, how accurate are the resulting estimates? I will describe some approaches for resolving these issues for a small but useful class of spatial models. The first approach involves the construction of exact sampling procedures for which these questions are moot since the draws are i.i.d. The second (closely related) approach involves constructing MCMC algorithms with good theoretical properties, which allows for the computation of consistent estimates of Monte Carlo standard errors. These standard errors can then be used to determine the run length of the MCMC algorithm. I will conclude with a comparison of these methods in the context of some data examples.


Jim Hobert, University of Florida

Title: A Theoretical Comparison of the Data Augmentation, Marginal Augmentation, and PX-DA Algorithms

The PX-DA algorithm of Liu and Wu (1999, JASA) and the marginal augmentation (MA) algorithm of Meng and van Dyk (1999, Biometrika) are recently developed alternatives to the standard data augmentation (DA) algorithm. PX-DA and MA often converge much faster than DA and are only slightly more computationally demanding. This talk concerns a family of generalizations of the DA algorithm that includes the PX-DA and MA algorithms. The main result is that each member of the family is more efficient than DA in the sense of smaller asymptotic variance in the central limit theorem. As an example, the DA algorithm of Albert and Chib (1993, JASA) for Bayesian probit regression is compared with the alternative PX-DA algorithm developed by Liu and Wu.

(This is joint work with Dobrin Marchev, Baruch College (CUNY) and Vivekananda Roy, University of Florida.)


Galin Jones, University of Minnesota

Title: Fixed-Width Output Analysis for Markov Chain Monte Carlo

Markov chain Monte Carlo is a method of producing a correlated sample from a target distribution. Features of the target distribution are then estimated using this sample. Thus a fundamental question in MCMC is: When should the sampling stop? That is, when have we achieved good estimates? I will introduce a method that stops the MCMC sampling when the width of a confidence interval is less than a user-specified value. Hence calculating Monte Carlo standard errors is a critical step in assessing the output of the simulation. In this talk I will give an overview of fixed-width methodology and methods for calculating Monte Carlo standard errors. The main results will be illustrated in several examples.


Xiao-Li Meng, Harvard University

Title: Markov Chain Monte Carlo: Practice; or: The Full Monte Carlo: A Live Performance (with Stars)

Markov chain Monte Carlo (MCMC) methods, originating in computational physics about half a century ago, have seen an enormous range of applications in quantitative scientific investigations. This is mainly due to their ability to simulate from very complex distributions such as the ones needed in realistic statistical models. The first part of this tutorial introduces the two most frequently used MCMC algorithms: the Gibbs sampler and the Metropolis-Hastings algorithm. Using simple yet non-trivial examples, we demonstrate, via live performance, the good, bad, and ugly implementations. Audience participations are required, though no prior experiences are needed. In the second part of this tutorial, a newly obtained interweaving Data Augmentation algorithm is presented for fitting a Cox Process model to an astrophysics dataset.


Jesper Møller, Aalborg University (Denmark)

Title: MCMC Algorithms for Distributions with Intractable Normalizing Constants, with a View to Perfect Simulation and Non-parametric Bayesian Inference for Inhomogeneous Markov Point Processes

Sampling from Bayesian posterior distributions are problematic when the likelihood for the parameter of interest involves an intractable normalizing constant which is also a function of that parameter. In [3] we present new methodology for drawing samples from such a posterior distribution without approximation, in [1] we illustrate the method, and in [2] we discuss how to do perfect simulation, which is an important ingredient of the method in [3].

References:

[1] K.K. Berthelsen and J. Møller (2007). Non-parametric Bayesian inference for 
    inhomogeneous Markov point processes. Research Report R-2007-9, Department of 
    Mathematical Sciences, Aalborg University.

[2] W.S. Kendall and J. Møller (2000). Perfect simulation using dominating
    processes on ordered spaces, with application to locally stable point 
    processes. Advances in Applied Probability, 32, 844-865.

[3] J. Møller, A.N. Pettitt, K.K. Berthelsen and R.W. Reeves (2006). An efficient 
    Markov chain Monte Carlo method for distributions with intractable normalising 
    constants. Biometrika, 93, 451-458.

Elchanan Mossel, University of California, Berkeley

Title: Mixing of Gibbs Sampling on Random Graphs

The tree-like structure of random graphs suggests that rapid mixing of Gibbs sampling on random graphs should depend on properties of the corresponding trees. Proving rigorous statements to that effect is a major challenge. I will describe two recent results where such statements were proven.

(This talk is based on joint projects with Dror Weitz and Nick Wormald, and with Allan Sly.)


Yuval Peres, Microsoft Research and University of California, Berkeley

Title: Can Extra Updates Delay Mixing?

We consider Glauber dynamics (starting from an extremal configuration) in a monotone spin system, and show that interjecting extra updates cannot increase the expected Hamming distance and the total variation distance to the stationary distribution. We deduce that for monotone Markov random fields, when block dynamics contracts a weighted Hamming metric, so does single-site dynamics started at extremal configurations. In particular, our result completes work of Kenyon, Mossel, and Peres concerning Glauber dynamics for the Ising model on trees. Our approach also shows that on bipartite $n$-vertex graphs, alternating updates systematically between odd and even vertices cannot improve the mixing time by more than a factor of $\log n$ compared to updates at uniform random locations.

(This is joint work with Peter Winkler.)


Dana Randall, Georgia Institute of Technology

Title: Slow Mixing of Glauber Dynamics and Simulated Tempering Algorithms

Markov chains can be effective tools for sampling elements of large sets, including independent sets and matchings of a graph. These models arise in statistical physics in the context of lattice gases and monomer-dimer models, where there is typically a parameter corresponding to temperature and we are interested in sampling from the Gibbs distribution at various temperatures. For many of these models local Markov chains undergo a phase transition whereby they are efficient at high temperatures and inefficient at low temperatures. We will describe some recent results that rigorously show conditions under which certain Markov chains will require exponential time to converge to equilibrium, including Glauber (local) dynamics and simulated tempering.


Jeffrey Rosenthal, University of Toronto

Title: Markov Chain Monte Carlo: Synthesizing Theory and Practice

This tutorial will discuss ways in which theoretical Markov chain analysis can impact the application of MCMC algorithms to modern statistical inference problems. Specific topics will include: differences between discrete and continuous state space chains; general ergodicity theorems via phi-irreducibility; and the use of coupling and minorization conditions to bound convergence to stationarity. We will also discuss adaptive MCMC algorithms, which update their tuning parameters during their run in an effort to better converge. We will provide a coupling proof of their ergodicity (under appropriate conditions) and consider some examples.


David van Dyk, University of California, Irvine

Title: Implementing Gibbs-type Samplers Using Incompatible Draws, with Applications in High-Energy Astrophysics

Ensuring the compatibility of conditional distributions is commonly emphasized in the constructions of Gibbs samplers: Compatibility is a necessary condition for proper convergence. In this talk I discuss a systematic strategy for the construction of MCMC samplers involving incompatible conditional distributions. This strategy is designed not only to ensure proper convergence to the the target stationary distribution but also to provide chains with quicker convergence and smaller autocorrelations at stationarity. Our method can be viewed as a generalization of blocking in that the systematic strategy sometimes yields a blocked version of the parent Gibbs sampler. It is more general than blocking in that it sometimes yields a set of incompatible conditional distributions that do not correspond to any standard Gibbs sampler. The method was motivated by an applied problem in high-energy astrophysics, and we illustrate its use in three problems stemming from our applied work. Our strategy can be viewed as a stochastic version of the ECME and AECM algorithms. Changing the order of the steps of these EM-type algorithms can destroy their celebrated monotone convergence. Likewise, changing the order of the conditional draws of our samplers can destroy their target stationary distribution. Thus, some care must be taken in deriving and implementing these fast samplers.


Eric Vigoda, Georgia Institute of Technology

Title: Markov Chain Monte Carlo: Theory

I'll begin the tutorial by describing the basic connection between approximate counting (or estimating the partition function) and randomly sampling from the corresponding Gibbs distribution. We'll then dive into the focus of this tutorial: an overview of techniques for bounding the mixing time of finite Markov chains. The emphasis will be on the canonical paths technique and its application to matchings/permanent, and the coupling technique.


Jinfeng Zhang, Harvard University

Title: Biopolymer Structure Simulation and Optimization via Fragment Re-growth Monte Carlo

An efficient exploration of the configuration space of a biopolymer is essential for its structure modeling and prediction. In this study, we propose a new Monte Carlo method, Fragment Re-growth via Energy-guided Sequential Sampling (FRESS), which incorporates the idea of multi-grid Monte Carlo into the framework of configurational-bias Monte Carlo and is suitable for chain polymer simulations. As a byproduct, we also found a novel extension of the Metropolis Monte Carlo framework applicable to all Monte Carlo computations. We tested FRESS on hydrophobic-hydrophilic (HP) protein folding models in both two and three dimensions. For the benchmark sequences, FRESS not only found all the minimum energies obtained by previous studies with substantially less computation time, but also found new lower energies for all the three-dimensional HP models with sequence length longer than 80 residues.

(This is joint work with department colleagues Sam C. Kou and Jun S. Liu.)


Previous: Program
Workshop Index
DIMACS Homepage
Contacting the Center
Document last modified on May 22, 2007.