Thesis Defense: Sequential Data Inference via Matrix Estimation: Causal Inference, Cricket and Retail

Monday, July 16, 2018 - 10:30am

Event Calendar Category

LIDS Thesis Defense

Speaker Name

Muhammad Jehangir Amjad

Affiliation

LIDS

Building and Room Number

32-D677

Abstract

This thesis proposes a unified framework to capture the temporal and longitudinal variation across multiple instances of sequential data. Examples of such data include sales of a product over a period of time across several retail locations; trajectories of scores across cricket games; and annual tobacco consumption across the United States over a period of decades. A key component of our work is the latent variable model (LVM), which views the sequential data as a matrix whose rows correspond to the individual sequences and whose columns represent the sequential aspect. The goal is to use information both within a sequence and across different sequences to address two inferential questions for a given sequence of data: (a) imputation, that is, ‘filling missing values’ and ‘de-noising’ observed values, and (b) forecasting, that is, predicting ‘future’ values.
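To make the matrix view concrete, here is a minimal sketch of one common matrix estimation technique, hard singular value thresholding, applied to imputation and de-noising. It is an illustration under stated assumptions, not the algorithm from the thesis: the function name `estimate_matrix`, the `rank` parameter, and the rescaling by the observed fraction are all hypothetical choices made for this example.

```python
import numpy as np

def estimate_matrix(observed, rank):
    """Impute and de-noise a (sequences x time) matrix by truncated SVD.

    `observed` uses NaN for missing entries. `rank` is an assumed,
    user-chosen number of singular values to keep (hypothetical parameter).
    """
    mask = ~np.isnan(observed)
    # Fraction of entries observed; used to undo the shrinkage that
    # zero-filling the missing entries introduces.
    p_hat = max(mask.mean(), 1e-12)
    filled = np.where(mask, observed, 0.0)

    # Keep only the top `rank` singular values (hard thresholding).
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    s[rank:] = 0.0
    return (u * s) @ vt / p_hat
```

Each row would hold one sequence (for example, one retail location) and each column one time step; the returned matrix has the same shape, with missing entries filled in and observed entries smoothed toward a low-rank structure.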

 

Using this framework, we build upon recent developments in ‘matrix estimation’ to address these inferential goals in three different applications. First, we present a robust variant of the popular ‘synthetic control’ method, which is used in observational studies to draw causal statistical inferences. Second, we develop a score trajectory forecasting algorithm for the game of cricket using historical data. This leads to an unbiased target resetting algorithm for shortened cricket games, which is an improvement upon the biased incumbent approach (Duckworth-Lewis-Stern). Third, we present an algorithm that leads to a consistent estimator for the time- and location-varying demand for products using censored observations in the context of retail. As a final contribution, the algorithms presented are implemented and packaged as a scalable open-source library for the imputation and forecasting of sequential data, with applications beyond those presented in this work.
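The first application can be illustrated with a rough sketch of a synthetic control estimate built on a de-noised donor matrix. This is a simplified illustration, not the thesis's algorithm or its library's API: the function name `synthetic_control_forecast`, its parameters, and the use of plain least squares for the weights are assumptions made here.

```python
import numpy as np

def synthetic_control_forecast(donors, target_pre, rank):
    """Estimate a counterfactual trajectory for a treated unit.

    donors:     (n_donors, T) outcomes of untreated units over all T periods.
    target_pre: (T0,) outcomes of the treated unit before the intervention.
    rank:       assumed number of singular values to keep (hypothetical).
    Returns a length-T array: the synthetic (counterfactual) trajectory.
    """
    # Step 1: de-noise the donor matrix by truncated SVD.
    u, s, vt = np.linalg.svd(donors, full_matrices=False)
    s[rank:] = 0.0
    denoised = (u * s) @ vt

    # Step 2: fit donor weights on the pre-intervention periods only.
    t0 = target_pre.shape[0]
    x_pre = denoised[:, :t0].T  # shape (T0, n_donors)
    weights, *_ = np.linalg.lstsq(x_pre, target_pre, rcond=1e-10)

    # Step 3: apply the weights over all periods.
    return denoised.T @ weights
```

Comparing the returned trajectory with the treated unit's observed post-intervention outcomes gives an estimate of the intervention's effect.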

 

THESIS COMMITTEE CHAIR:

Prof. Devavrat Shah (EECS, MIT)

 

THESIS COMMITTEE:

Prof. Alberto Abadie (Economics, MIT)

Prof. Vivek Farias (Sloan, MIT)

Prof. Vishal Misra (Computer Science, Columbia)

Prof. John Tsitsiklis (EECS, MIT)