LIDS Research Scientist Suvrit Sra and LIDS Affiliate Stefanie Jegelka Awarded NSF BIGDATA Grant

November 6, 2017

LIDS Principal Research Scientist Suvrit Sra and colleague Stefanie Jegelka (a LIDS affiliate and an Assistant Professor in MIT’s Electrical Engineering and Computer Science Department) have been awarded a Critical Techniques, Technologies and Methodologies for Advancing Foundations and Applications of Big Data Sciences and Engineering (BIGDATA) grant from the National Science Foundation (NSF). Sra is the Principal Investigator and Jegelka the co-Principal Investigator on the project, “Towards Automating Data Analysis: Interpretable, Interactive, and Scalable Learning via Discrete Probability.” This multi-year research effort began October 1, 2017. Its goal is to create a novel suite of models and algorithms for analyzing complex datasets, with a particular focus on three factors crucial for next-generation machine learning: (1) interpretability; (2) interactivity; and (3) automated learning.

As machine learning increasingly permeates advances in science and technology, fundamental improvements in the state of the art could benefit society in a broad range of domains, from health care to materials science. This potential practical impact, combined with the clear need to address machine learning’s conceptual and algorithmic limitations (which include difficulties in generalization and suboptimal use of data), motivated Sra and Jegelka’s proposed research. Grounded in probability theory, their project will lay theoretical foundations for a new set of analytical tools that address some of these limitations. This work will further both the technical frameworks and fundamental methodologies that underpin machine learning, while generating new theoretical questions and research directions.

For more information about the project, please see: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1741341&HistoricalAwards=false

About the award: The NSF BIGDATA program is a major initiative that aims to advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large and complex data sets. These capabilities are needed to: accelerate the progress of scientific discovery and innovation; lead to new fields of inquiry that would not otherwise be possible; encourage the development of new data analytic tools and algorithms; facilitate scalable, accessible, and sustainable data infrastructure; increase understanding of human and social processes and interactions; and promote economic growth and improved health and quality of life.