Non-Convex Learning via Stochastic Gradient Langevin Dynamics

Thursday, April 27, 2017 - 4:00pm to Friday, April 28, 2017 - 3:55pm

Event Calendar Category

LIDS Seminar Series

Speaker Name

Maxim Raginsky

Affiliation

University of Illinois

Building and Room number

32-155

Abstract

Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Stochastic Gradient Descent, where properly scaled isotropic Gaussian noise is added to an unbiased estimate of the gradient at each iteration. This modest change allows SGLD to escape local minima and suffices to guarantee asymptotic convergence to global minimizers for sufficiently regular non-convex objectives. In this talk, I will present a non-asymptotic analysis in the context of non-convex learning problems, giving finite-time guarantees for SGLD to find approximate minimizers of both empirical and population risks. As in the asymptotic setting, the analysis relates the discrete-time SGLD Markov chain to a continuous-time diffusion process. A new tool that drives the results is the use of weighted transportation cost inequalities to quantify the rate of convergence of SGLD to a stationary distribution in the Euclidean $2$-Wasserstein distance. This talk is based on joint work with Sasha Rakhlin and Matus Telgarsky.
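
To make the recipe in the abstract concrete, the following is a minimal Python sketch of one SGLD iteration: a stochastic-gradient step plus isotropic Gaussian noise scaled by the step size and an inverse temperature. The function name `sgld_step`, the step size `eta`, the inverse temperature `beta`, and the toy objective are illustrative choices and are not taken from the talk or the underlying paper.

```python
import numpy as np

def sgld_step(x, stochastic_grad, eta, beta, rng):
    """One SGLD update: a stochastic-gradient step plus scaled isotropic Gaussian noise.

    x               : current iterate (d-dimensional numpy array)
    stochastic_grad : callable returning an unbiased estimate of the gradient at x
    eta             : step size (illustrative value below)
    beta            : inverse temperature controlling the noise scale (illustrative)
    """
    noise = rng.standard_normal(x.shape)  # isotropic Gaussian noise
    return x - eta * stochastic_grad(x) + np.sqrt(2.0 * eta / beta) * noise

# Illustrative usage on a toy non-convex objective f(x) = sum(x**4 - x**2).
rng = np.random.default_rng(0)
noisy_grad = lambda x: 4 * x**3 - 2 * x + 0.1 * rng.standard_normal(x.shape)  # unbiased gradient estimate
x = rng.standard_normal(2)
for _ in range(1000):
    x = sgld_step(x, noisy_grad, eta=1e-2, beta=10.0, rng=rng)
print(x)  # with small eta and large beta, iterates tend to hover near minimizers of f
```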

Biography

Maxim Raginsky received the B.S. and M.S. degrees in 2000 and the Ph.D. degree in 2002 from Northwestern University, all in electrical engineering. He has held research positions with Northwestern, the University of Illinois at Urbana-Champaign (UIUC), where he was a Beckman Foundation Fellow from 2004 to 2007, and Duke University. In 2012, he returned to UIUC, where he is currently an Assistant Professor and William L. Everitt Fellow in Electrical and Computer Engineering. He is also a faculty member of the Signals, Inference, and Networks (SINE) group and the Decision and Control group in the Coordinated Science Laboratory. Dr. Raginsky received a Faculty Early Career Development (CAREER) Award from the National Science Foundation in 2013. His research interests lie at the intersection of information theory, machine learning, and control. He is a member of the editorial boards of Foundations and Trends in Communications and Information Theory and the IEEE Transactions on Network Science and Engineering.