Sampling Using Diffusion Processes, from Langevin to Schrödinger

These notes are based on the tutorial I gave at the Geometric Methods in Optimization and Sampling Boot Camp at the Simons Institute in Berkeley.

Suppose we wish to obtain samples from some probability measure {\mu} on {{\mathbb R}^d}. If {\mu} has a sufficiently well-behaved density {f} with respect to the Lebesgue measure, i.e., {\mu(dx) = f(x) dx}, then we can use the (overdamped) continuous-time Langevin dynamics, governed by the Ito stochastic differential equation (SDE)

\displaystyle  d X_t = \frac{1}{2}\nabla \log f(X_t) dt + dW_t, \qquad t \ge 0 \ \ \ \ \ (1)

where the initial condition {X_0} is generated according to some probability law {\mu_0}, and {(W_t)_{t \ge 0}} is the standard {d}-dimensional Brownian motion. Let {\mu_t} denote the probability law of {X_t}. Then, under appropriate regularity conditions on {f}, one can establish the following:

  • {\mu} is the unique invariant distribution of (1), i.e., if {\mu_0 = \mu}, then {\mu_t = \mu} for all {t}.
  • {\mu_t} converges to {\mu} in a suitable sense as {t \rightarrow \infty}. In fact, it is often possible to show that there exists a constant {c > 0}, depending only on {\mu}, such that one has exponential convergence to equilibrium:

    \displaystyle  		{\rm dist}(\mu_t, \mu) \le e^{-t/c}{\rm dist}(\mu_0, \mu)

    for some distance between probability measures on {{\mathbb R}^d}.
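
To make the dynamics (1) concrete, here is a minimal sketch that simulates them with an Euler-Maruyama discretization (often called the unadjusted Langevin algorithm). It is an illustration under stated assumptions, not part of the original tutorial: the target {\mu} is taken to be the standard Gaussian on {{\mathbb R}^2}, so that {\nabla \log f(x) = -x}, and the names `grad_log_density`, `langevin_euler_maruyama`, the step size, and the number of chains are all hypothetical choices.

```python
import numpy as np


def grad_log_density(x):
    # Assumed target: standard Gaussian, f(x) proportional to exp(-|x|^2 / 2),
    # so grad log f(x) = -x. Swap in another score function for other targets.
    return -x


def langevin_euler_maruyama(x0, step, n_steps, rng):
    """Euler-Maruyama discretization of dX_t = (1/2) grad log f(X_t) dt + dW_t.

    x0 has shape (n_chains, d); each row is an independent chain.
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Drift term scaled by the step, Brownian increment scaled by sqrt(step).
        x = x + 0.5 * step * grad_log_density(x) + np.sqrt(step) * noise
    return x


rng = np.random.default_rng(0)
# mu_0 is a point mass at (3, 3), far from the bulk of the target.
x0 = np.full((5000, 2), 3.0)
samples = langevin_euler_maruyama(x0, step=0.01, n_steps=2000, rng=rng)

# Empirical mean and variance per coordinate after total time t = 20.
print(samples.mean(axis=0), samples.var(axis=0))
```

With this choice of target, (1) is an Ornstein-Uhlenbeck process, so the empirical mean should be close to 0 and the empirical variance close to 1 in each coordinate. Note that the discretization introduces its own bias on top of the finite-time error, so the output is approximate for two separate reasons.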

Since the convergence to {\mu} is only asymptotic, the Langevin process (1) run for any finite time gives only approximate samples from {\mu}. I would like to discuss an alternative approach that uses diffusion processes to obtain exact samples in finite time. This approach is based on ideas that appeared in two papers from the 1930s by Erwin Schrödinger in the context of physics, and is now referred to as the Schrödinger bridge problem.
