Wednesday, March 13, 2019 - 3:00pm to 3:30pm
Event Calendar Category
LIDS & Stats Tea
LIDS & EECS
Building and Room Number
Memorization of data in deep neural networks has become a subject of significant research interest. We prove that overparameterized single-layer fully connected autoencoders memorize training data: they produce outputs in (a non-linear version of) the span of the training examples. In contrast to the fully connected case, we prove that depth is necessary for memorization in convolutional autoencoders. Moreover, we observe that adding nonlinearity to deep convolutional autoencoders results in a stronger form of memorization: instead of outputting points in the span of the training images, deep convolutional autoencoders tend to output individual training images. Since convolutional autoencoder components are building blocks of deep convolutional networks, we envision that our findings will shed light on the important phenomenon of memorization in overparameterized deep networks.
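As a rough illustration of the span claim, the sketch below considers only a simplified special case that is an assumption of this example and not the setting of the talk: a purely linear single-layer autoencoder, initialized at zero and trained by plain gradient descent on fewer examples than input dimensions. In that case every gradient update has columns in the span of the training examples, so the output for any input also lies in that span. The function names `train_linear_autoencoder` and `distance_to_span` are hypothetical helpers introduced here.

```python
import numpy as np


def train_linear_autoencoder(X, steps=2000):
    """Gradient descent on 0.5 * ||W X - X||_F^2, starting from W = 0.

    X has shape (d, n): n training examples of dimension d, with n < d
    (the overparameterized regime). Zero initialization is an assumption
    of this sketch.
    """
    d, _ = X.shape
    W = np.zeros((d, d))
    # Step size 1 / sigma_max(X)^2 keeps the iteration stable.
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(steps):
        # Gradient is (W X - X) X^T; its columns lie in span(X)
        # whenever the columns of W already do.
        W -= lr * (W @ X - X) @ X.T
    return W


def distance_to_span(X, y):
    """Euclidean distance from y to the column span of X (least squares)."""
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.linalg.norm(X @ coeffs - y)


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))   # 3 training examples in 20 dimensions
W = train_linear_autoencoder(X)

z = rng.standard_normal(20)        # an arbitrary input, not a training example
out = W @ z

train_error = np.linalg.norm(W @ X - X)
span_gap = distance_to_span(X, out)
print(f"training reconstruction error: {train_error:.2e}")
print(f"distance of output to span of training data: {span_gap:.2e}")
```

Both printed quantities should be close to zero: the network reconstructs the training examples, and its output on an unrelated input still falls in their span. The non-linear and convolutional cases described in the abstract are not covered by this sketch.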
Co-authors: Mikhail Belkin, Caroline Uhler
Adit Radha is a Ph.D. student in EECS at MIT advised by Caroline Uhler. He received his B.S. and M.Eng. from MIT. His research interests are robustness and generalization in deep learning, interpretable neural models, and causal inference.