Engineering and Learning Visual Representations: Invariance, Sufficiency, and the role of Control

Tuesday, November 18, 2014 - 4:00pm to Wednesday, November 19, 2014 - 3:55pm

Event Calendar Category

LIDS Seminar Series

Speaker Name

Stefano Soatto

Affiliation

UCLA

Building and Room Number

32-141

Visual representations are functions of visual data tailored for decision and control tasks, where much of the variability is due to nuisance factors such as viewpoint, illumination, and partial occlusion. An optimal representation would be maximally insensitive to those factors and maximally "informative" for the task.

I will formalize an expression for the maximal-invariant/minimal-sufficient representation and show that, under very restrictive conditions, it is related to methods currently employed in Computer Vision, such as SIFT, except for a small but important modification. When such a modification is applied, performance on benchmark image matching datasets improves by over 30% relative to SIFT, surpassing even convolutional neural networks trained on millions of images.

More importantly, such restrictive conditions can be lifted. I will present multi-view representations and discuss their properties in relation to their "informative content", which is maximal when the statistic is complete. I will argue that completeness is achievable by controlling the data acquisition process, in an active sensing/experiment design setting. For visual data, this entails the ability to move in physical space. Thus, it is not just that vision is useful for mobility; conversely, mobility is needed to construct optimal representations.

Professor Soatto received his Ph.D. in Control and Dynamical Systems from the California Institute of Technology in 1996; he joined UCLA in 2000 after being Assistant and then Associate Professor of Electrical and Biomedical Engineering at Washington University, Research Associate in Applied Sciences at Harvard University, and Assistant Professor in Mathematics and Computer Science at the University of Udine, Italy. He received his D.Ing. degree (highest honors) from the University of Padova, Italy, in 1992. Dr. Soatto is the recipient of the David Marr Prize (with Y. Ma, J. Kosecka and S. Sastry of U.C. Berkeley) for work on Euclidean reconstruction and reprojection up to subgroups. He also received the Siemens Prize with the Outstanding Paper Award from the IEEE Computer Society for his work on optimal structure from motion (with R. Brockett of Harvard). He received the National Science Foundation Career Award and the Okawa Foundation Grant. He is a Member of the Editorial Board of the International Journal of Computer Vision (IJCV), the Journal of Mathematical Imaging and Vision (JMIV), the SIAM Journal on Imaging Sciences, and Foundations and Trends in Computer Graphics and Vision. He is a Fellow of the IEEE.