Transport and Beyond: Efficient Optimization over Probability Distributions

Friday, June 24, 2022 - 10:00am to 11:30am

Event Calendar Category

LIDS Thesis Defense

Speaker Name

Jason Altschuler

Affiliation

EECS & LIDS

Building and Room Number

32-D677

Join Zoom meeting

https://mit.zoom.us/j/96866462054

Abstract

The core of classical optimization focuses on the setting where decision variables are vectors in R^d. However, modern applications throughout machine learning, data science, and engineering demand high-dimensional optimization problems where decision variables are probability distributions. Can such optimization problems be solved efficiently? This thesis presents two interrelated lines of work in this direction through the common thread of Optimal Transport. In this talk, I will briefly summarize both parts.

First, we consider Optimal Transport and other optimization problems over joint distributions with two constrained marginals. Such tasks are fundamental in alignment problems, matrix problems, graph problems, and more. We establish near-linear runtimes for approximation algorithms for several classical problems under this umbrella: Optimal Transport, Minimum-Mean-Cycle, Matrix Balancing, and Matrix Scaling. Two recurring key themes are the use of entropic regularization for exploiting separability of optimization constraints, and the use of probabilistic inequalities for obtaining dimension-free convergence bounds. A dictionary is presented that unifies these various problems, which were historically studied in disparate communities.

Second, we consider Multimarginal Optimal Transport (MOT) and other optimization problems over joint distributions with many constrained marginals. Despite syntactic similarities with the problems in part I, these problems require fundamentally different algorithms. The key issue is that in general, MOT requires exponential time in the number of marginals k and their support sizes n. We develop a general theory about what “structure” makes MOT solvable in polynomial time in n and k. We demonstrate this general theory on applications in diverse fields ranging from operations research to data science to fluid dynamics to quantum chemistry. We dedicate special attention to the popular application of Wasserstein barycenters -- resolving the complexity of this problem and uncovering the subtle dependence of the dimension on the answer.

Thesis Committee:

Pablo Parrilo, EECS (advisor)

Guy Bresler, EECS

Philippe Rigollet, Mathematics