Memorization in Overparameterized Autoencoders

Wednesday, March 13, 2019 - 3:00pm to 3:30pm

Event Calendar Category

LIDS & Stats Tea

Speaker Name

Adit Radha

Affiliation

LIDS & EECS

Building and Room Number

LIDS Lounge

Memorization of data in deep neural networks has become a subject of significant research interest. We prove that overparameterized single-layer fully connected autoencoders memorize training data: they produce outputs in (a nonlinear version of) the span of the training examples. In contrast to fully connected autoencoders, we prove that depth is necessary for memorization in convolutional autoencoders. Moreover, we observe that adding nonlinearity to deep convolutional autoencoders results in a stronger form of memorization: instead of outputting points in the span of the training images, deep convolutional autoencoders tend to output individual training images. Since convolutional autoencoder components are building blocks of deep convolutional networks, we envision that our findings will shed light on the important phenomenon of memorization in overparameterized deep networks.
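As a rough illustration of the span claim, the sketch below covers only the simplest case: a purely linear single-layer autoencoder with more input dimensions than training examples. It is not the speaker's code or experimental setup. The function name `train_linear_autoencoder`, the zero initialization, the step size, and the random data are all assumptions made for this example. Under those assumptions, every gradient step has the form X B X^T, so the output for any input stays in the span of the training examples.

```python
import numpy as np

def train_linear_autoencoder(X, steps=5000):
    """Gradient descent on 0.5 * ||W X - X||_F^2, starting from W = 0.

    X has shape (d, n): n training examples stored as columns, with n < d
    (the overparameterized regime). Zero initialization is an assumption
    of this sketch, not something stated in the abstract.
    """
    d, _ = X.shape
    # Step size 1 / sigma_max(X)^2 keeps plain gradient descent stable.
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    W = np.zeros((d, d))
    for _ in range(steps):
        W -= lr * (W @ X - X) @ X.T  # gradient of the squared reconstruction loss
    return W

rng = np.random.default_rng(0)
d, n = 20, 3                       # 20 input dimensions, 3 training examples
X = rng.standard_normal((d, n))    # synthetic training data (hypothetical)
W = train_linear_autoencoder(X)

# Feed an arbitrary input that is unrelated to the training data.
z = rng.standard_normal(d)
out = W @ z

# Measure how far the output is from the span of the training examples.
coeffs, *_ = np.linalg.lstsq(X, out, rcond=None)
residual = np.linalg.norm(X @ coeffs - out)
print(f"distance of output from span of training data: {residual:.2e}")
```

The printed distance should be at floating-point level, and the training examples themselves should be reconstructed almost exactly. The nonlinear and convolutional settings described in the abstract are not covered by this sketch.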

Co-authors: Mikhail Belkin, Caroline Uhler

Adit Radha is a Ph.D. student in EECS at MIT advised by Caroline Uhler. He received his B.S. and M.Eng. from MIT. His research interests are robustness and generalization in deep learning, interpretable neural models, and causal inference.