Near-optimal Learning in Sequential Games

Tuesday, February 28, 2023 - 2:00pm

Event Calendar Category

LIDS Thesis Defense

Speaker Name

Tiancheng Yu

Building and Room number

32-D463 (the Star Conference room)

Abstract

In the presence of multiple learning agents, sequential decision-making problems become sequential games. Instead of finding an optimal policy as in the single-agent setting, the learning objective is now to find a Nash equilibrium. As in other machine learning paradigms, reinforcement learning agents are hungry for data, so sample efficiency is always a key concern. In this thesis, we study two of the most fundamental questions in learning in sequential games:
1. (Lower bound) How many samples are necessary to find a Nash equilibrium in a sequential game, no matter which learning algorithm is used?
2. (Upper bound) How can one design (computationally) efficient learning algorithms with sharp sample complexity guarantees?
When the upper and lower bounds match, (minimax) optimal learning is achieved; a rough formalization is sketched after the list below. In this thesis, we present near-optimal learning results for two kinds of sequential games:
1. Markov games (stochastic games), where all agents can observe the underlying state.
2. Imperfect-information extensive-form games, where different agents can have different observations given the same underlying state.
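For readers less familiar with the terminology, the following sketch formalizes the objective and the notion of minimax optimality; the notation (the value function V, accuracy \epsilon, and the rate f(S, A, H)) is illustrative and not taken from the thesis itself. In a two-player zero-sum game with value V(\mu, \nu) under policies \mu and \nu, a policy pair (\hat{\mu}, \hat{\nu}) is an \epsilon-approximate Nash equilibrium if neither player can gain more than \epsilon by deviating unilaterally:

\[
  \max_{\mu} V(\mu, \hat{\nu}) - \epsilon
  \;\le\; V(\hat{\mu}, \hat{\nu}) \;\le\;
  \min_{\nu} V(\hat{\mu}, \nu) + \epsilon .
\]

Learning is then minimax near-optimal when the algorithm's sample complexity upper bound matches the information-theoretic lower bound up to logarithmic factors, e.g.,

\[
  \underbrace{\tilde{O}\!\big(f(S, A, H)\,\epsilon^{-2}\big)}_{\text{achieved by the algorithm}}
  \quad\text{vs.}\quad
  \underbrace{\Omega\big(f(S, A, H)\,\epsilon^{-2}\big)}_{\text{required by any algorithm}} .
\]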
To achieve near-optimal learning, a series of novel algorithmic ideas and analysis tools will be introduced, such as adaptive quantification of estimation uncertainty (bonus design), certified policies, balanced exploration, and a log-partition function (dual) reformulation, which may be of independent interest.
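As a concrete, purely illustrative instance of bonus design (the adaptive bonuses in the thesis differ in their details), optimism-based algorithms in this literature typically add an exploration bonus to estimated values. A standard Hoeffding-style choice, with N_h(s, a) the visit count of state-action pair (s, a) at step h, H the horizon, K the number of episodes, and \delta the failure probability, is

\[
  b_h(s, a) \;=\; c \, H \sqrt{\frac{\log(S A H K / \delta)}{\max\{N_h(s, a),\, 1\}}},
\]

which keeps the value estimates optimistic with high probability and steers exploration toward under-sampled state-action pairs.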


Thesis Supervisor(s): Prof. Suvrit Sra (Advisor), Prof. Constantinos Daskalakis, Prof. Chi Jin (Princeton)