Monday, May 1, 2023 - 10:00am
Event Calendar Category
LIDS Thesis Defense
Speaker Name
Yunzong Xu
Affiliation
IDSS & LIDS
Building and Room number
E18-304
Join Zoom meeting
https://mit.zoom.us/j/99444768630
Machine learning is playing an increasingly important role in decision making, with key applications ranging from recommendation systems and dynamic pricing to personalized medicine and clinical trials. While statistical machine learning traditionally excels at making predictions based on i.i.d. offline data, many modern decision-making tasks require making dynamic decisions based on data collected online. This thesis aims to bridge this discrepancy and advance the theory and practice of data-driven dynamic decision making. To achieve this goal, we develop methodologies that automatically translate advances in statistical learning into effective dynamic decision making. Focusing on contextual bandits, a core class of online decision-making problems, we present the first optimal and efficient reduction from contextual bandits to offline regression. An important consequence of our results is that advances in offline regression immediately translate to contextual bandits, statistically and computationally. We illustrate the advantages of our results through new guarantees in complex operational environments and experiments on real-world datasets. We then discuss how our results can be extended to more challenging setups, including reinforcement learning in large state spaces. Beyond the positive algorithmic results, this thesis establishes new fundamental limits for general, unstructured reinforcement learning, emphasizing the importance of problem structure in reinforcement learning. Altogether, these results contribute to an improved understanding of the statistical and computational complexity of data-driven dynamic decision making.
Committee:
David Simchi-Levi (supervisor)
Alexander Rakhlin
John N. Tsitsiklis