Past Events

Understanding and Accelerating Statistical Sampling via PDEs and Deep Learning

Abstract

A fundamental problem in Bayesian inference and statistical machine learning is to sample efficiently from probability distributions. Standard Markov chain Monte Carlo methods can be prohibitively expensive due to various complexities of the target distribution, such as multimodality, high dimensionality, and large datasets. To improve sampling efficiency, several interesting new ideas and methods have recently been proposed in the machine learning community, yet their theoretical properties remain poorly understood.

In the first part of the talk, I aim to demonstrate how PDE analysis can help us understand some recently proposed sampling algorithms. Specifically, I will focus on Stein variational gradient descent (SVGD), a popular particle-based sampling algorithm in the machine learning community. I rigorously justify SVGD as a sampling algorithm through a mean-field analysis. I will also introduce a new birth-death dynamics, which can serve as a universal strategy for accelerating existing sampling algorithms. The acceleration effect of the birth-death dynamics is examined carefully when applied to the classical Langevin diffusion. For both the SVGD dynamics and the birth-death dynamics, I will emphasize the (Wasserstein) gradient flow structure and the convergence to equilibrium of the underlying PDE dynamics.
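
To make the algorithms concrete, here is a minimal Python sketch of the SVGD particle update discussed above; the RBF kernel, bandwidth, step size, and the Gaussian target are illustrative assumptions rather than choices made in the talk.

    import numpy as np

    def rbf_kernel(X, h):
        # Pairwise RBF kernel matrix K[i, j] = exp(-|x_i - x_j|^2 / (2 h^2)).
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2 * h ** 2))

    def svgd_step(X, grad_logp, h=1.0, eps=0.1):
        # One SVGD update: x_i <- x_i + eps * phi(x_i), where
        # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log pi(x_j) + grad_{x_j} k(x_j, x_i) ].
        n = X.shape[0]
        K = rbf_kernel(X, h)
        drift = K @ grad_logp(X)  # attraction toward high-probability regions
        # Repulsion: sum_j grad_{x_j} k(x_j, x_i) = sum_j (x_i - x_j) K[i, j] / h^2.
        repulsion = (np.sum(K, axis=1, keepdims=True) * X - K @ X) / h ** 2
        return X + eps * (drift + repulsion) / n

    # Illustrative target: standard 2-d Gaussian, so grad log pi(x) = -x.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2)) + 5.0  # particles start far from the target
    for _ in range(500):
        X = svgd_step(X, lambda X: -X)
    # The empirical distribution of X now approximates the target.

The birth-death acceleration can likewise be sketched at the particle level, continuing the sketch above (numpy and rng as defined there): an unadjusted Langevin step is followed by birth-death jumps driven by the log-ratio between the current particle density and the target. The kernel density estimate of the particle density, the jump rates, and the replacement rule below are simplifying assumptions for illustration only.

    def kde_log_density(X, h):
        # Rough kernel density estimate of log rho at the particle locations.
        n, d = X.shape
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        K = np.exp(-sq / (2 * h ** 2)) / (2 * np.pi * h ** 2) ** (d / 2)
        return np.log(np.mean(K, axis=1) + 1e-300)

    def birth_death_langevin_step(X, grad_logp, log_p, dt=0.01, h=0.3, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        n, d = X.shape
        # 1) Unadjusted Langevin move: dX = grad log pi(X) dt + sqrt(2 dt) * noise.
        X = X + dt * grad_logp(X) + np.sqrt(2 * dt) * rng.normal(size=(n, d))
        # 2) Birth-death jumps with rate |beta|, where beta = log rho - log pi is centered
        #    over the particles (centering conserves particle number and cancels constants).
        beta = kde_log_density(X, h) - log_p(X)
        beta -= beta.mean()
        jump = rng.random(n) < 1.0 - np.exp(-np.abs(beta) * dt)
        for i in np.where(jump)[0]:
            j = rng.integers(n)
            if beta[i] > 0:
                X[i] = X[j]  # overrepresented particle dies; a random particle is duplicated
            else:
                X[j] = X[i]  # underrepresented particle is duplicated in place of a random one
        return X

    # Illustrative usage on the same Gaussian target (an unnormalized log-density suffices):
    X = rng.normal(size=(200, 2)) + 5.0
    for _ in range(1000):
        X = birth_death_langevin_step(X, lambda X: -X,
                                      lambda X: -0.5 * np.sum(X ** 2, axis=1), rng=rng)
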
The second part of the talk is devoted to learning implicit generative models for sampling. Generative models such as generative adversarial networks (GANs) provide an important framework for learning and sampling from complex distributions. Despite their celebrated empirical success, many theoretical questions remain unresolved. A fundamental open question is: how well can deep neural networks express distributions? I will answer this question by proving a universal approximation theorem of deep neural networks for generating distributions.
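
To make the notion of a network expressing a distribution concrete, the sketch below constructs a push-forward distribution: a latent vector drawn from a simple reference distribution is mapped through a network G, and the approximation question asks how closely the law of G(z) can match a given target, for instance in Wasserstein distance. The architecture and the (untrained) parameters are illustrative assumptions only.

    import numpy as np

    def generator(z, params):
        # A small fully connected network G: latent z -> sample x.
        W1, b1, W2, b2 = params
        return np.tanh(z @ W1 + b1) @ W2 + b2

    # Push-forward sampling: x = G(z) with z drawn from a standard Gaussian reference.
    rng = np.random.default_rng(0)
    latent_dim, hidden_dim, data_dim = 4, 64, 2
    params = (0.5 * rng.normal(size=(latent_dim, hidden_dim)), np.zeros(hidden_dim),
              0.5 * rng.normal(size=(hidden_dim, data_dim)), np.zeros(data_dim))
    z = rng.normal(size=(10000, latent_dim))
    x = generator(z, params)  # 10000 samples from the push-forward distribution of G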