Non-reversible parallel tempering (NRPT) is an effective algorithm for sampling from distributions with complex geometry, such as posterior distributions of weakly identifiable, high-dimensional Bayesian models or Gibbs distributions in statistical mechanics. In this work we introduce methods for the automated tuning of NRPT and establish convergence results that explain its observed empirical success. A central feature of all of the methods we consider is that they are robust and can be fully automated, allowing them to be used in software with minimal effort from the user, as evidenced by their application by our collaborators to open problems in astrophysics. Furthermore, the methods are all parallelizable and scale well with modern computational resources.
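For context, NRPT alternates local exploration within each annealed chain with non-reversible swap proposals between adjacent chains, typically organized in a deterministic even-odd (DEO) schedule. The sketch below is a minimal illustration of that communication pattern under stated assumptions; `log_annealed` and `local_step` are hypothetical placeholders for the annealed log-densities and the within-chain exploration kernel, and this loop is not the implementation discussed in the talk.

```python
import numpy as np

def nrpt_round(states, log_annealed, betas, n_scans, local_step, rng):
    """Illustrative NRPT scans: local exploration followed by deterministic
    even-odd (DEO) swap proposals between adjacent chains.
    `log_annealed(x, beta)` is a placeholder log-density of the annealed
    distribution at inverse temperature `beta`."""
    n = len(betas)
    for scan in range(n_scans):
        # Local exploration: each chain targets its own annealed distribution.
        states = [local_step(x, b, rng) for x, b in zip(states, betas)]

        # Non-reversible communication: propose swaps for even pairs on even
        # scans and odd pairs on odd scans (DEO schedule).
        parity = scan % 2
        for i in range(parity, n - 1, 2):
            log_ratio = (log_annealed(states[i], betas[i + 1])
                         + log_annealed(states[i + 1], betas[i])
                         - log_annealed(states[i], betas[i])
                         - log_annealed(states[i + 1], betas[i + 1]))
            if np.log(rng.uniform()) < log_ratio:
                states[i], states[i + 1] = states[i + 1], states[i]
    return states
```

Because all swaps within a parity class involve disjoint pairs of chains, they can be attempted in parallel, which is consistent with the scalability claims above.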
We begin with a study of how to bridge NRPT and variational inference in order to obtain more effective samplers. To do so, we introduce a generalized annealing path connecting the posterior to a variational reference that is adaptively tuned to minimize the forward (inclusive) KL divergence from the posterior to the reference. To easily tune a general class of such variational families, we introduce AutoGD: a gradient descent method that automatically adapts its learning rate at each iteration. Our theory establishes the convergence of AutoGD, recovering the optimal rate of gradient descent (up to a constant) for a broad class of functions. Finally, to shed light on the empirical success of NRPT, we establish its uniform (geometric) ergodicity under a model of efficient local exploration. We obtain analogous ergodicity results for classical reversible parallel tempering, providing new evidence that NRPT dominates its reversible counterpart.
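As a rough illustration of automatic learning rate adaptation in gradient descent, the sketch below grows the step size after a successful step and shrinks it when a trial step fails to decrease the objective. This simple grow-and-backtrack rule is an assumption made for illustration only and is not necessarily the AutoGD procedure analyzed in the talk; the function `adaptive_gd` and its parameters are hypothetical.

```python
import numpy as np

def adaptive_gd(f, grad_f, x0, lr=1.0, max_iter=200, tol=1e-8):
    """Gradient descent with a simple automatic step size rule:
    shrink the learning rate until a step decreases the objective,
    then tentatively grow it for the next iteration.
    (Illustrative only; not the AutoGD rule from the talk.)"""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        improved = False
        while lr > 1e-16:
            x_new = x - lr * g
            f_new = f(x_new)
            if f_new < fx:
                improved = True
                break
            lr *= 0.5  # backtrack: the step was too large
        if not improved:
            break  # no descent step found; stop
        x, fx = x_new, f_new
        lr *= 2.0  # grow the step size for the next iteration
    return x, fx

# Example: minimize a simple quadratic.
x_opt, f_opt = adaptive_gd(lambda x: np.sum(x**2), lambda x: 2 * x, np.ones(3))
```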
To join this seminar virtually, please request Zoom connection details from ea@stat.ubc.ca.
Speaker's page: https://nikola-sur.com/
Location: ESB 4192 / Zoom
Event date: -
Speaker: Nikola Surjanovic, UBC Statistics Ph.D. student