Model-Based Reinforcement Learning for Countably Infinite State Space MDP

Wednesday, September 9, 2020 - 4:00pm to 4:30pm

Event Calendar Category

LIDS & Stats Tea

Speaker Name

Bai Liu

Affiliation

LIDS

Zoom meeting id

993 8044 5463

Join Zoom meeting

https://mit.zoom.us/j/99380445463

With the rapid advance of information technology, network systems have become increasingly complex, and hence the underlying system dynamics are typically unknown or difficult to characterize. Finding a good network control policy is critical to achieving desirable network performance (e.g., high throughput or low average job delay). Online/sequential learning algorithms are well-suited to learning the optimal control policy from observed data for systems whose underlying dynamics are unknown. In this work, we consider using model-based reinforcement learning (RL) to learn the optimal control policy of queueing networks so that the average job delay (or equivalently the average queue backlog) is minimized. Existing RL techniques, however, cannot handle the unbounded state space of the network control problem. To overcome this difficulty, we propose a new algorithm, called Reinforcement Learning for Queueing Networks (RL-QN), which applies model-based RL methods over a finite subset of the state space. We establish that the average queue backlog under RL-QN with an appropriately constructed subset can be arbitrarily close to the optimal result. We evaluate RL-QN in dynamic server allocation and routing problems. Simulations show that RL-QN minimizes the average queue backlog effectively.
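The talk and paper contain the precise construction and guarantees; the sketch below is only an illustrative toy, not RL-QN itself. It shows the core idea described in the abstract: learn an empirical transition model only on a bounded subset of the (countably infinite) queueing state space, plan on that truncated model, and fall back to a simple stabilizing heuristic whenever the system leaves the subset. All names and parameters (LAM, MU, the truncation threshold U, fallback_policy, the two-queue dynamic server allocation setup) are assumptions made for this example, and a discounted-cost value iteration stands in for the average-backlog objective.

```python
import numpy as np

# Illustrative sketch (not the paper's exact algorithm): model-based RL on a
# truncated state space for a toy two-queue, one-server allocation problem.
LAM = (0.3, 0.3)   # assumed Bernoulli arrival probabilities
MU = (0.7, 0.6)    # assumed service success probability when a queue is served
U = 10             # states with backlog <= U in both queues form the learned subset
rng = np.random.default_rng(0)

def step(state, action):
    """Simulate one slot: serve queue `action`, then add arrivals."""
    q = list(state)
    if q[action] > 0 and rng.random() < MU[action]:
        q[action] -= 1
    for i in range(2):
        q[i] += rng.random() < LAM[i]
    return tuple(int(x) for x in q)

def fallback_policy(state):
    """Stabilizing heuristic used outside the learned subset: serve the longer queue."""
    return int(state[1] > state[0])

counts = {}  # empirical model: (state, action) -> {next_state: visit count}

def record(s, a, s2):
    counts.setdefault((s, a), {}).setdefault(s2, 0)
    counts[(s, a)][s2] += 1

def plan(gamma=0.95, iters=200):
    """Value iteration on the empirical model over the truncated subset.

    Per-slot cost is the total backlog; states outside the subset get a
    pessimistic value so the planned policy avoids them (discounted cost is
    used here as a simple proxy for the average-backlog objective).
    """
    states = [(i, j) for i in range(U + 1) for j in range(U + 1)]
    V = {s: 0.0 for s in states}
    def q_value(s, a):
        nxt = counts.get((s, a))
        if not nxt:
            return None
        total = sum(nxt.values())
        exp_next = sum(c / total * V.get(s2, 2.0 * U) for s2, c in nxt.items())
        return sum(s) + gamma * exp_next
    for _ in range(iters):
        for s in states:
            vals = [v for v in (q_value(s, a) for a in (0, 1)) if v is not None]
            if vals:
                V[s] = min(vals)
    def policy(s):
        best, best_val = fallback_policy(s), float("inf")
        for a in (0, 1):
            v = q_value(s, a)
            if v is not None and v < best_val:
                best, best_val = a, v
        return best
    return policy

# Phase 1: explore randomly while inside the subset to learn the model.
state = (0, 0)
for _ in range(20000):
    inside = max(state) <= U
    a = int(rng.integers(2)) if inside else fallback_policy(state)
    nxt = step(state, a)
    if inside:
        record(state, a, nxt)
    state = nxt

# Phase 2: plan on the learned truncated model, then act with the fallback
# policy whenever the state leaves the subset.
learned = plan()
state, backlog = (0, 0), 0
for _ in range(20000):
    a = learned(state) if max(state) <= U else fallback_policy(state)
    state = step(state, a)
    backlog += sum(state)
print("average backlog under learned policy:", backlog / 20000)
```

The separation into a learned region and a stabilizing fallback is what lets a finite model cover an unbounded state space in this toy: the heuristic keeps excursions outside the subset rare, so the planned policy governs almost all slots.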

Bai is a Ph.D. student in LIDS advised by Prof. Eytan Modiano. His research interests lie in learning and control problems in networked systems (data networks, logistics networks, etc.), applying reinforcement learning, stochastic optimization, and inference methods.