Tuesday, February 14, 2017 - 4:00pm to Wednesday, February 15, 2017 - 3:55pm
Event Calendar Category
LIDS Seminar Series
Northwestern Kellogg School of Management
An information-theoretic perspective on the exploration/exploitation tradeoff
Modern online marketplaces feed themselves: they rely on historical data to optimize content and user interactions, but the data generated by these interactions is in turn fed back into the system and used to optimize future interactions. As this cycle continues, good performance requires algorithms capable of learning through sequential interactions, systematically experimenting to gather useful information, and balancing exploration with exploitation.
In this talk, I will formulate a broad family of such online decision-making problems. I'll then present two algorithms: Thompson sampling, which has recently been the focus of much attention in academia and industry, and information-directed sampling, a recent development inspired by a fresh information-theoretic perspective. I will discuss simulation results, associated insights, and a new information-theoretic regret analysis that applies to both algorithms. Time permitting, the talk will also touch on the extension of these ideas to exploration in reinforcement learning.
*This talk is based on joint work with Benjamin Van Roy.
Daniel Russo is an assistant professor at Northwestern's Kellogg School of Management. He received a PhD from Stanford University in 2015 under the supervision of Ben Van Roy, and spent the 2015-2016 academic year as a postdoc at Microsoft Research New England. Dan's research lies at the intersection of statistical machine learning and sequential decision-making. He works on data-driven online optimization, multi-armed bandit problems, and reinforcement learning.