Information Theoretic Interpretation of Deep Neural Networks

Tuesday, March 20, 2018 - 3:00pm to Wednesday, March 21, 2018 - 3:55pm

Event Calendar Category

LIDS Seminar Series

Speaker Name

Lizhong Zheng

Affiliation

MIT

Building and Room Number

32-141

Abstract

In this talk, we formulate a new problem called the "universal feature selection" problem, in which we need to select from high-dimensional data a low-dimensional feature that can be used to solve not one, but a whole family of inference problems. We solve this problem by developing a new information metric that can be used to quantify the semantics of data, and by using a geometric analysis approach. We then show that a number of concepts in information theory and statistics, such as the HGR correlation and common information, are closely connected to the universal feature selection problem. At the same time, a number of learning algorithms, such as PCA, compressed sensing, FM, and deep neural networks, can also be interpreted as implicitly or explicitly solving the same problem under various forms of constraints.

In particular, we show that, based on our approach, we can give an analytical expression for the weights computed in deep neural networks. This gives us the option either to compute these weights with a separate routine, different from the standard training procedure of neural networks, or to use the computation results of a neural network for other problems. We will show some experimental results where our theory can help us design and use neural networks in more flexible and more rational ways.
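As a point of reference for the HGR correlation mentioned above, the following is a minimal sketch of one standard way to compute the HGR maximal correlation of two finite-alphabet random variables: take the second-largest singular value of the matrix with entries P(x, y) / sqrt(P(x) P(y)). It is not taken from the talk. The function name `hgr_maximal_correlation` and the example joint distribution are hypothetical, chosen only for illustration, and the sketch assumes NumPy is available and that every marginal probability is strictly positive.

```python
import numpy as np


def hgr_maximal_correlation(joint):
    """Return (rho, f, g) for a finite joint distribution P(x, y).

    rho is the HGR maximal correlation; f and g are zero-mean,
    unit-variance functions of X and Y that attain it.
    Assumes all marginal probabilities are strictly positive.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize to a probability table
    px = joint.sum(axis=1)               # marginal of X (rows)
    py = joint.sum(axis=0)               # marginal of Y (columns)

    # Normalized joint matrix: B[x, y] = P(x, y) / sqrt(P(x) P(y)).
    b = joint / np.sqrt(np.outer(px, py))

    # The largest singular value of B is 1 (singular vectors sqrt(px), sqrt(py)).
    # The second-largest singular value is the HGR maximal correlation.
    u, s, vt = np.linalg.svd(b)
    rho = s[1]

    # Rescale the second singular vectors into functions of X and Y.
    f = u[:, 1] / np.sqrt(px)
    g = vt[1, :] / np.sqrt(py)
    return rho, f, g


# Illustrative example (made-up numbers): a uniform binary input passed
# through a binary symmetric channel with crossover probability 0.2.
example_joint = [[0.4, 0.1],
                 [0.1, 0.4]]
rho, f, g = hgr_maximal_correlation(example_joint)
print(rho)  # approximately 0.6, i.e. 1 - 2 * 0.2
```

The functions `f` and `g` returned here are the maximally correlated pair of one-dimensional features of X and Y. Keeping further singular vectors would give higher-dimensional features in the same spirit.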

Biography

Lizhong Zheng received the B.S. and M.S. degrees, in 1994 and 1997 respectively, from the Department of Electronic Engineering, Tsinghua University, China, and the Ph.D. degree, in 2002, from the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. Since 2002, he has been working at MIT, where he is currently a professor of Electrical Engineering. His research interests include information theory, statistical inference, communications, and network theory. He received the Eli Jury Award from UC Berkeley in 2002, the IEEE Information Theory Society Paper Award in 2003, the NSF CAREER Award in 2004, and the AFOSR Young Investigator Award in 2007. He served as an associate editor for IEEE Transactions on Information Theory, and as the general co-chair for the IEEE International Symposium on Information Theory in 2012. He is an IEEE Fellow.