Monday, April 1, 2024 - 4:00pm
Event Calendar Category
LIDS Seminar Series
Speaker Name
R. Srikant
Affiliation
UIUC
Building and Room Number
32-155
We consider a version of policy optimization in reinforcement learning in which the rewards must be learned through human feedback. We study the sample complexity of this approach and compare it to the sample complexity of an algorithm where the rewards are known a priori. We show that the amount of additional data needed to infer rewards from human feedback is a small fraction of the total amount of data needed for policy optimization. Joint work with Yihan Du, Anna Winnicki, Gal Dalal and Shie Mannor.
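To make the setting concrete, the sketch below illustrates one common way rewards are inferred from human feedback: fitting a linear reward model to pairwise preference labels with a Bradley-Terry (logistic) model. This is a generic illustration under assumed dimensions and names, not the speakers' algorithm or analysis; the talk's question concerns how many such comparisons are needed on top of the data already required when rewards are known.

```python
# Minimal sketch (illustrative only, not the speakers' method): learning a
# linear reward model from pairwise human preferences via a Bradley-Terry
# logistic model. All dimensions and variable names are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 5                        # feature dimension of a trajectory (assumed)
n_pairs = 2000               # number of human preference comparisons (assumed)
w_true = rng.normal(size=d)  # unknown "true" reward weights

# Each comparison presents two trajectories summarized by feature vectors.
x_a = rng.normal(size=(n_pairs, d))
x_b = rng.normal(size=(n_pairs, d))

# Human label y = 1 if trajectory a is preferred, sampled from the
# Bradley-Terry model: P(a preferred) = sigmoid(w_true . (x_a - x_b)).
diff = x_a - x_b
p = 1.0 / (1.0 + np.exp(-diff @ w_true))
y = rng.binomial(1, p)

# Estimate reward weights by logistic regression
# (gradient ascent on the preference log-likelihood).
w = np.zeros(d)
lr = 0.1
for _ in range(500):
    pred = 1.0 / (1.0 + np.exp(-diff @ w))
    grad = diff.T @ (y - pred) / n_pairs
    w += lr * grad

# The learned reward model would then feed a standard policy-optimization
# routine in place of the true (unknown) rewards.
print("cosine similarity to true weights:",
      w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true)))
```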
R. Srikant is a Grainger Chair in Engineering, Co-Director of the C3.ai Digital Transformation Institute, and a Professor of Electrical and Computer Engineering and in the Coordinated Science Lab at the University of Illinois Urbana-Champaign. His research interests span machine learning, applied probability, and communication networks. He is the recipient of the 2021 ACM SIGMETRICS Achievement Award, the 2019 IEEE Koji Kobayashi Computers and Communications Award, and the 2015 IEEE INFOCOM Achievement Award. He has also received several best paper awards, including the 2015 IEEE INFOCOM Best Paper Award and the 2017 Applied Probability Society Best Publication Award.