Thesis Defense: Investigations in Applied Probability and High-Dimensional Statistics

Tuesday, April 7, 2020 - 3:00pm to Wednesday, April 8, 2020 - 3:55pm

Event Calendar Category

LIDS Thesis Defense

Speaker Name

Julia Gaudio

Affiliation

LIDS & ORC

ZOOM LINK: https://mit.zoom.us/j/157149867

THESIS COMMITTEE:

Prof. David Gamarnik (thesis advisor)

Prof. Patrick Jaillet (thesis advisor)

Prof. Ankur Moitra

Prof. Guy Bresler

ABSTRACT:

This thesis makes contributions to the areas of applied probability and high-dimensional statistics. We introduce the Attracting Random Walks model, which is a Markov chain model on a graph. In the Attracting Random Walks model, particles move among the vertices of a graph transition probabilities depending on the locations of the other particles. The model is designed so that transitions to more occupied vertices are more likely. We analyze the mixing time of the model under different values of the parameter governing the attraction. We additionally consider the repelling version of the model, in which particles are more likely to move to vertices with low occupancy.

Next, we contribute to the methodology of Markov processes by studying convergence rates for Markov processes under perturbation. We specifically consider parametrized stochastically ordered Markov processes, such as queues. We bound the time until a given Markov process converges to stationary after its parameter experiences a perturbation.

The following chapter considers the random instance Traveling Salesman Problem. Namely, n points (cities) are placed uniformly at random in the unit square. It was shown by Beardwood et al (1959) that the optimal tour length through these points, divided by the square root of n, converges to a constant. Determining the value of the constant is an open problem. We improve the lower bound over the original bound of Beardwood et al.

Finally, we study a statistical model: isotonic regression. Isotonic regression is the problem of estimating a coordinate-wise monotone function from data. We introduce the sparse version of the problem and study it in the high-dimensional setting. We provide optimization-based algorithms for the recovery of the ground truth function and provide guarantees for function estimation in terms of L2 loss.