Statistical Aspects of Optimal Transport

Tuesday, May 16, 2023 - 1:30pm to 2:30pm

Event Calendar Category

LIDS Thesis Defense

Speaker Name

Austin J. Stromme

Affiliation

LIDS & IDSS

Building and Room number

2-449

Abstract

Optimal transport (OT) is a powerful framework for comparing and interpolating probability measures that has had extensive theoretical and practical application in recent years. Unfortunately, OT suffers from a well-known statistical curse of dimensionality, motivating an intense effort to adapt OT to the high-dimensional settings common in modern data. In this thesis, we take a statistical perspective and contribute to this effort by identifying a number of settings where the curse of dimensionality can be overcome with practical algorithms that achieve practical statistical rates. In practice, vanilla OT is less common than an entropically regularized variant, entropic OT, which has long been understood to provide an essential form of computational regularization by affording the use of faster and simpler algorithms. While the curse of dimensionality for vanilla OT is classical, the statistical effects of entropic regularization have remained mysterious. We present two statistical results, tailored to different regimes, for entropic OT: one that clarifies the statistical meaning of entropic regularization as a distance scale, and another that identifies surprising dimension-free behavior. Beyond direct applications of OT between two distributions, practitioners are often interested in leveraging the geometric structure of OT to manipulate many distributions at once, and a natural and very popular approach in this regard is the OT barycenter. Barycenters generalize averages to non-Euclidean spaces, and we provide fundamental results characterizing the convergence of empirical barycenters to their population counterparts in a broad class of geodesic metric spaces, with application to OT barycenters. Finally, we study first-order methods for computing OT barycenters in the Gaussian case, establishing dimension-free global rates of convergence despite the non-convexity of the objective.
Altogether, our theoretical results push the statistical and algorithmic limits of OT in directions of relevance to practitioners.

Committee: Philippe Rigollet (supervisor), Guy Bresler, Martin Wainwright