Theory of Computation Colloquium Series: Learning Models: Connections Between Boosting, Hard-core Distributions, Dense Models, GAN, and Regularity

Tuesday, September 12, 2017 - 4:00pm to Wednesday, September 13, 2017 - 3:55pm

Event Calendar Category

CSAIL

Speaker Name

Russell Impagliazzo

Affiliation

University of California, San Diego

Building and Room number

32-G449

Abstract

A theme that cuts across many domains of computer science and mathematics is to find simple representations of complex mathematical objects such as graphs, functions, or distributions on data. These representations need to capture how the object interacts with a class of tests, and to approximately determine the outcome of these tests.

For example, a scientist is trying to find a mathematical model explaining observed data about some phenomenon, such as the kinds of butterflies in a forest.  A minimal criterion for success is that the model should accurately predict the results of future observations.  When is this possible?  This general situation arises in many contexts in computer science and mathematics.  In machine learning, the object might be a distribution on data points (high-dimensional real vectors), and the tests might be half-spaces.  The goal would be to learn a simple representation of the data that determines the probability of any half-space, or possibly of intersections of half-spaces.  In computational complexity, the object might be a Boolean function or a distribution on strings, and the tests are functions of low circuit complexity.  In graph theory, the object is a large graph, and the tests are the cuts in the graph; the representation should approximately determine the size of any cut.  In additive combinatorics, the object might be a function or distribution over an Abelian group, and the tests might be correlations with linear functions or polynomials.
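In each of these settings, "approximately determining the outcome of a test" can be phrased the same way; the formalization below is my own notation rather than anything taken from the talk:

```latex
% A simple object D' \epsilon-approximates a complex object D with respect to a
% class of tests \mathcal{T} if every test gives nearly the same answer on both:
\bigl| \, \mathbb{E}_{x \sim D}[T(x)] \;-\; \mathbb{E}_{x \sim D'}[T(x)] \, \bigr| \;\le\; \epsilon
\qquad \text{for every } T \in \mathcal{T}.
```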

This talk will illustrate the connections between these various contexts.

In particular, we'll show direct reductions between the following notions:

From machine learning, a boosting algorithm converts a weak learner, which finds a hypothesis that has a small correlation with an unknown function, into a strong learner, which finds a hypothesis agreeing with the function almost everywhere.
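As a rough illustration, here is a schematic AdaBoost-style skeleton in Python (my own sketch, not the specific boosting algorithms discussed in the talk); `weak_learner` is a hypothetical callable that returns a hypothesis slightly correlated with the ±1 labels under the given weights:

```python
import math

def boost(examples, labels, weak_learner, rounds):
    """AdaBoost-style skeleton: reweight examples so each weak hypothesis
    focuses on the points the current combination still gets wrong."""
    n = len(examples)
    weights = [1.0 / n] * n
    hypotheses = []                                        # list of (alpha, h) pairs
    for _ in range(rounds):
        h = weak_learner(examples, labels, weights)        # hypothetical weak learner
        err = sum(w for w, x, y in zip(weights, examples, labels) if h(x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)              # keep the log well defined
        alpha = 0.5 * math.log((1 - err) / err)            # vote weight of this hypothesis
        # Reweight: misclassified points gain weight, correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * h(x))
                   for w, x, y in zip(weights, examples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
        hypotheses.append((alpha, h))
    # The strong learner is the weighted majority vote of the weak hypotheses.
    def strong(x):
        return 1 if sum(a * h(x) for a, h in hypotheses) >= 0 else -1
    return strong
```

The reweighted distribution maintained by such an algorithm, concentrated on the examples the current hypotheses get wrong, is exactly the kind of "hard" sub-distribution that appears in the hard-core lemma below, which is one way the reductions in the talk can be read.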

From complexity theory, a hard-core distribution lemma says that any Boolean function that cannot be computed with success more than 1 − δ on random inputs has a sub-distribution of density roughly 2δ on which the function is pseudo-random.
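For concreteness, one standard form of the statement reads as follows (my paraphrase with the usual parameters, not necessarily the exact formulation used in the talk):

```latex
% Hard-core lemma (informal): if every circuit of size s fails to compute f on at
% least a \delta fraction of uniformly random inputs, then there is a distribution H
% of density roughly 2\delta (no input gets weight much more than 1/(2\delta 2^n))
% on which f looks like a random coin flip to slightly smaller circuits:
\Pr_{x \sim H}\bigl[ C(x) = f(x) \bigr] \;\le\; \tfrac{1}{2} + \epsilon
\qquad \text{for every circuit } C \text{ of size } s' = s \cdot \mathrm{poly}(\epsilon, \delta).
```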

Initially from additive combinatorics, dense model theorems say that any distribution either has a simple witness that it is small (has low min-entropy) or is indistinguishable, by simple tests, from a distribution that is large (has high min-entropy).
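In one common formulation (again my paraphrase, with δ measuring density and the tests drawn from a fixed simple class):

```latex
% Dense model dichotomy (informal): for a distribution D over \{0,1\}^n, either some
% small combination of tests from \mathcal{T} certifies that D is concentrated on a
% set of size less than \delta 2^n (low min-entropy), or there is a model M that is
% \delta-dense in the uniform distribution (high min-entropy) and fools every test:
\bigl| \, \mathbb{E}_{x \sim D}[T(x)] \;-\; \mathbb{E}_{x \sim M}[T(x)] \, \bigr| \;\le\; \epsilon
\qquad \text{for every } T \in \mathcal{T}.
```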

From machine learning, generative adversarial networks (GANs) are algorithms that use a distinguisher, which finds differences between a target distribution and the output of a sampling algorithm, to refine the sampling algorithm.  Algorithmic versions of dense model theorems can be viewed as analogous to GANs in many ways, but can have stronger provable guarantees.
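The distinguisher-driven loop can be sketched as follows; this is a schematic of the general idea, not a specific GAN architecture, and `train_distinguisher` and `update_sampler` are hypothetical subroutines passed in by the caller:

```python
def refine_sampler(target_samples, sampler, train_distinguisher, update_sampler,
                   rounds, eps):
    """Schematic GAN-style loop: repeatedly look for a test that tells the sampler's
    output apart from the target distribution, and use it to improve the sampler."""
    for _ in range(rounds):
        fake_samples = [sampler() for _ in range(len(target_samples))]
        # Hypothetical subroutine: returns a test d and its distinguishing advantage.
        d, advantage = train_distinguisher(target_samples, fake_samples)
        if advantage <= eps:
            # No test in the class separates the two distributions: we are done.
            return sampler
        # Hypothetical subroutine: push the sampler toward fooling the test d.
        sampler = update_sampler(sampler, d)
    return sampler
```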

Initially from graph theory, a regularity lemma shows that an arbitrary mathematical object has a simply described approximation. 
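As one concrete instance (the weak regularity lemma of Frieze and Kannan, stated here for graphs; the talk treats regularity more generally):

```latex
% Weak regularity (Frieze--Kannan): every graph G on vertex set V, |V| = n, can be
% approximated by a weighted sum of k = O(1/\epsilon^2) complete bipartite "cut"
% graphs on vertex pairs (S_i, T_i) so that every cut has nearly the right size:
\Bigl| \, e_G(A, B) \;-\; \sum_{i=1}^{k} c_i \, |A \cap S_i| \, |B \cap T_i| \, \Bigr| \;\le\; \epsilon n^2
\qquad \text{for all disjoint } A, B \subseteq V.
```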

Many of these connections were pointed out by Trevisan, Tulsiani, and Vadhan, and by Reingold, Trevisan, Tulsiani, and Vadhan.  However, our reductions are more black-box, giving us greater versatility.

While many of our connections are used to prove known results in new ways, we also obtain some novel improvements and generalizations, both by exploiting the generality of these reductions and the wide assortment of boosting algorithms, and by applying these results recursively.