Interpreting and Improving Diffusion Models From an Optimization Perspective

Tuesday, December 5, 2023 - 1:00pm

Event Calendar Category

Other LIDS Events

Speaker Name

Chenyang Yuan

Affiliation

Toyota Research Institute

Building and Room Number

32-D677

Diffusion models have been interpreted as the reversal of a stochastic process that corrupts clean data with increasing levels of random noise. This reverse process can also be interpreted as likelihood maximization of a noise-perturbed data distribution using learned gradients (also known as score functions). While these interpretations are inherently probabilistic, the samplers widely used in practice are often deterministic. In this work, we tackle this divide and provide a deterministic framework for reasoning about, improving, and potentially discovering new applications of diffusion models.

We propose a new interpretation of diffusion models based on the intuition that, under the manifold hypothesis, adding noise to an image on a low-dimensional manifold perturbs it in a direction orthogonal to the manifold, so learning the noise is approximately learning the projection back onto the manifold. Specifically, we interpret diffusion as noisy gradient descent applied to the squared-distance function to the image manifold. Under a series of simple assumptions on the learned denoiser, we provide a new error analysis of sampling algorithms such as DDIM and DDPM.
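
As a rough illustration of this interpretation (a toy sketch, not the construction from the talk or paper), the snippet below replaces the image manifold with the unit circle in the plane and the learned denoiser with a hypothetical ideal one that returns the scaled gradient of half the squared distance to the circle. The parameterization x_t = x_0 + sigma_t * eps, the function names, and the noise schedule are all assumptions made for the example. Under them, a deterministic DDIM-style update coincides algebraically with a gradient-descent step of size 1 - sigma_next / sigma.

```python
import numpy as np

def project_to_circle(x):
    """Closest point on the unit circle (toy stand-in for the image manifold)."""
    return x / np.linalg.norm(x)

def ideal_denoiser(x, sigma):
    """Hypothetical ideal noise prediction: gradient of 0.5 * dist(x)^2, divided by sigma."""
    return (x - project_to_circle(x)) / sigma

def ddim_step(x, sigma, sigma_next):
    """Deterministic DDIM-style update, assuming x_t = x_0 + sigma_t * eps."""
    return x + (sigma_next - sigma) * ideal_denoiser(x, sigma)

def gradient_step(x, sigma, sigma_next):
    """The same update written as gradient descent on 0.5 * dist(x)^2."""
    step_size = 1.0 - sigma_next / sigma
    return x - step_size * (x - project_to_circle(x))

def sample(x_init, sigmas):
    """Run the deterministic sampler over a decreasing noise schedule."""
    x = np.asarray(x_init, dtype=float)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = ddim_step(x, sigma, sigma_next)
    return x

# Assumed schedule: 10 noise levels, geometrically spaced from 2.0 down to 0.01.
sigmas = np.geomspace(2.0, 0.01, num=10)
x_start = np.array([3.0, 4.0])  # starts at distance 4 from the circle
x_final = sample(x_start, sigmas)
distance = abs(np.linalg.norm(x_final) - 1.0)
print("final distance to the circle:", distance)
```

In this idealized setting each step shrinks the distance to the circle by the factor sigma_next / sigma, so the iterate approaches the manifold as the noise level decreases. A learned denoiser only approximates the projection, which is where an error analysis becomes necessary.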

Finally, we propose a new sampler based on two simple modifications to DDIM using insights from our theoretical results. In as few as 5-10 function evaluations, our sampler achieves state-of-the-art FID scores on pretrained CIFAR-10 and CelebA models and can generate high-quality samples on latent diffusion models.

Based on joint work (https://arxiv.org/abs/2306.04848) with Frank Permenter.

Chenyang is a research scientist at the Toyota Research Institute in Cambridge, MA. He completed his PhD in 2022 at the Laboratory for Information and Decision Systems (LIDS) and the Electrical Engineering and Computer Science (EECS) department at MIT. Before his PhD, he graduated from UC Berkeley with a B.A. in Computer Science. He is interested in convex optimization, semidefinite relaxations, large-scale problems, machine learning, and their applications in robotics.

https://chenyang.co/