The minimax algorithm for a sequential prediction game with square loss

Wednesday, March 8, 2017 - 4:30pm

Event Calendar Category

LIDS & Stats Tea

Speaker Name

Alan Malek

Affiliation

IDSS

Building and Room Number

LIDS Lounge

Consider the following game-theoretic model of sequential prediction: at every round 1 through T, the learner plays an action, the opponent observes this action and plays a response, and the learner incurs the square difference as a loss. The learner's hope is to have low regret, which is the difference between the accumulated loss loss and the loss of the best single action. However, instead of searching for an algorithm with upper and lower bounds of matching orders, we focus on the more robust notion of the minimax optimal algorithm. This algorithm guarantees the smallest regret against the worst sequence of responses, and playing it requires computing the optimal action at every possible game history (i.e. solving the game tree). Since minimax algorithms are entirely derived from the structure of the game, they have a natural, game-dependent regularization and no reliance on tuning parameters.