A Neural Network MCMC Sampler That Maximizes Proposal Entropy

Zengyi Li; Yubei Chen; Friedrich T Sommer

doi:10.3390/e23030269

Entropy (Basel). 2021 Feb 25;23(3):269. doi: 10.3390/e23030269.

Authors

Zengyi Li^{1,2}, Yubei Chen^{1,3}, Friedrich T Sommer^{1,4,5}

Affiliations

¹ Redwood Center for Theoretical Neuroscience, Berkeley, CA 94720, USA.
² Department of Physics, University of California Berkeley, Berkeley, CA 94720, USA.
³ Berkeley AI Research, University of California Berkeley, Berkeley, CA 94720, USA.
⁴ Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA.
⁵ Neuromorphic Computing Group, Intel Labs, 2200 Mission College Blvd., Santa Clara, CA 95054-1549, USA.

Abstract

Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability distributions and offer guarantees of exact sampling. However, in the continuous case, unfavorable geometry of the target distribution can greatly limit the efficiency of MCMC methods. Augmenting samplers with neural networks can potentially improve their efficiency. Previous neural network-based samplers were trained with objectives that either did not explicitly encourage exploration, or contained a term that encouraged exploration but only for well-structured distributions. Here we propose to maximize proposal entropy for adapting the proposal to distributions of any shape. To optimize proposal entropy directly, we devised a neural network MCMC sampler that has a flexible and tractable proposal distribution. Specifically, our network architecture utilizes the gradient of the target distribution for generating proposals. Our model achieved significantly higher efficiency than previous neural network MCMC techniques in a variety of sampling tasks, sometimes by more than an order of magnitude. Further, the sampler was demonstrated through the training of a convergent energy-based model of natural images. The adaptive sampler achieved unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sampler. The trained sampler also achieved better sample quality.

Keywords: MCMC; energy-based model; maximum entropy; neural network sampler.
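
To make the abstract's terms concrete, the following is a minimal sketch of the baseline it compares against: a Metropolis-adjusted Langevin dynamics sampler, whose Gaussian proposal has a closed-form entropy. It is not the authors' neural network sampler. The target (an ill-conditioned 2D Gaussian), the step size `eps`, and all function names are illustrative assumptions chosen for this example.

```python
import numpy as np

# Assumed toy target: a 2D Gaussian with very different scales per axis,
# standing in for the "unfavorable geometry" mentioned in the abstract.
SCALES = np.array([1.0, 0.1])

def log_p(x):
    """Unnormalized log-density of the toy target."""
    return -0.5 * np.sum(x**2 / SCALES**2)

def grad_log_p(x):
    """Gradient of log_p, used to steer the proposal."""
    return -x / SCALES**2

def mala_step(x, eps, rng):
    """One Metropolis-adjusted Langevin step with step size eps."""
    mean_fwd = x + 0.5 * eps**2 * grad_log_p(x)
    x_new = mean_fwd + eps * rng.standard_normal(x.shape)
    mean_bwd = x_new + 0.5 * eps**2 * grad_log_p(x_new)
    # Log proposal densities; the shared normalizing constant cancels.
    log_q_fwd = -np.sum((x_new - mean_fwd) ** 2) / (2 * eps**2)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2 * eps**2)
    log_alpha = log_p(x_new) - log_p(x) + log_q_bwd - log_q_fwd
    accept = np.log(rng.uniform()) < log_alpha
    return (x_new if accept else x), bool(accept)

def gaussian_proposal_entropy(eps, dim):
    """Entropy of the isotropic Gaussian proposal N(mean, eps^2 I)."""
    return 0.5 * dim * (1.0 + np.log(2 * np.pi * eps**2))

def run_chain(n_steps=5000, eps=0.1, seed=0):
    """Run a chain from the origin; return samples and acceptance rate."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    samples = np.empty((n_steps, 2))
    n_accept = 0
    for i in range(n_steps):
        x, accepted = mala_step(x, eps, rng)
        samples[i] = x
        n_accept += int(accepted)
    return samples, n_accept / n_steps
```

In this sketch the proposal entropy depends only on `eps`, and `eps` is limited by the narrowest direction of the target, so the proposal stays small along the wide direction as well. This is one way to read the abstract's motivation for a more flexible proposal whose entropy can be increased while the acceptance step keeps sampling exact.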

Grants and funding