Wednesday, March 8, 2017 - 4:30pm
Event Calendar Category
LIDS & Stats Tea
Building and Room Number
Consider the following game-theoretic model of sequential prediction: at every round 1 through T, the learner plays an action, the opponent observes this action and plays a response, and the learner incurs the square difference as a loss. The learner's hope is to have low regret, which is the difference between the accumulated loss loss and the loss of the best single action. However, instead of searching for an algorithm with upper and lower bounds of matching orders, we focus on the more robust notion of the minimax optimal algorithm. This algorithm guarantees the smallest regret against the worst sequence of responses, and playing it requires computing the optimal action at every possible game history (i.e. solving the game tree). Since minimax algorithms are entirely derived from the structure of the game, they have a natural, game-dependent regularization and no reliance on tuning parameters.