A simple asymptotic model for overparameterized learning

Wednesday, June 8, 2022 - 1:30pm to 2:30pm

Event Calendar Category

Other LIDS Events

Speaker Name

Anant Sahai

Affiliation

UC Berkeley

Building and Room Number

32-D677

Abstract

This talk builds on the signal-processing-centric "harmless interpolation" story that we shared at LIDS back in September 2019, and develops a simple, stylized asymptotic model for overparameterized learning. The eventual goal remains understanding the mystery: why do deep neural networks achieve zero training error and yet generalize well, even when the training data is noisy and there are many more parameters than data points? One nice feature of the stylized linear model we study is that its connection to signal-processing thinking gives us heuristic ways to conjecture asymptotic behavior. After recapping the essential story for regression problems, we discuss both binary classification and multi-class classification in an asymptotic regime where the number of classes can grow with the number of data points. We are able to verify what our heuristics led us to conjecture: there exist asymptotic regimes where classification generalizes even though the corresponding regression problems would not. Finally, reflecting on the asymptotic behavior in these regimes allows us to engage in counterexample-style thinking that sheds light on the (dis)connection between test loss and training loss in overparameterized settings.
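To make the phenomenon the abstract refers to concrete, here is a minimal numpy sketch of "harmless interpolation" in an overparameterized linear regression. It is not the speaker's model; the dimensions, noise level, and single-signal-direction setup are illustrative assumptions. The minimum-norm least-squares fit drives the training error to essentially zero despite label noise, and the test error measures whether that interpolation was harmless.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 20, 200           # n data points, d >> n parameters: overparameterized
w_true = np.zeros(d)
w_true[0] = 1.0          # one "signal" direction; the rest is excess capacity

X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)   # noisy training labels

# Minimum-norm interpolating solution: the least-squares solution of
# smallest Euclidean norm, computed via the pseudoinverse.
w_hat = np.linalg.pinv(X) @ y

train_mse = np.mean((X @ w_hat - y) ** 2)       # ~0: perfect interpolation

# Fresh test data reveals whether interpolating the noise was harmless.
X_test = rng.standard_normal((1000, d))
y_test = X_test @ w_true
test_mse = np.mean((X_test @ w_hat - y_test) ** 2)

print(f"train MSE: {train_mse:.2e}, test MSE: {test_mse:.3f}")
```

Varying the aspect ratio d/n (and the amount of energy in the signal direction) in a sketch like this is one way to probe the regimes the talk describes, where the regression test error can stay benign or blow up even as the training error remains exactly zero.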

Biography

Anant Sahai did his undergraduate work in EECS at UC Berkeley, and then went to MIT as a graduate student studying Electrical Engineering and Computer Science. After graduating with his PhD from LIDS, and before joining the Berkeley faculty, he worked on the theoretical/algorithmic side of a team at the de facto LIDS startup Enuvis, Inc., developing new adaptive software-radio techniques for GPS in very low SNR environments (such as those encountered indoors in urban areas). His research interests span information theory, decentralized control, machine learning, and wireless communication, with a particular interest in the intersections of these fields. He is the lead for Data and Machine Learning at the new NSF Center SpectrumX for understanding wireless spectrum. On the teaching side, he has been involved with revamping the core machine-learning-oriented curriculum at Berkeley; in fact, ideas discussed in this talk have made it into the main ML class at Berkeley.