Channel Comparison Methods and Statistical Problems on Graphs

Wednesday, April 26, 2023 - 10:30am

Event Calendar Category

LIDS Thesis Defense

Speaker Name

Yuzhou Gu

Building and Room Number

32-D677

Join Zoom meeting

https://mit.zoom.us/j/4257634786

Abstract

Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis explores several applications of these tools to statistical problems related to graphs. Part I focuses on information channels and channel comparison methods, including $f$-divergences, strong data processing inequalities, and preorders between channels. While these theories have been well-established for binary memoryless symmetric (BMS) channels, there remains much to discover for channels with larger input alphabets. We develop a theory of $q$-ary input-symmetric (FMS) channels, generalizing the theory of BMS channels. We demonstrate that while FMS channels exhibit more complex behavior than BMS channels, some properties of BMS channels can be extended to FMS channels. Furthermore, we perform tight analysis on contraction properties of the Potts channels, the simplest examples of FMS channels. In Part II, we apply the information theoretical methods established in Part I to solve problems related to random graph models with community structures. The random graph models include the stochastic block model (SBM) and its variants, which hold significance in statistics, machine learning, and network science. Central problems for these models ask about the feasibility and quality of recovering hidden community structures from unlabeled graphs. By utilizing the relationship between random graphs and random Galton-Watson trees, we demonstrate that many important problems on these graphical models can be reduced to problems on trees. We apply various channel comparison methods to solve these tree problems, demonstrating that different methods are effective for different problems and that selecting the correct tool for a problem is crucial. Problems we study include (for SBMs) weak recovery, optimal recovery algorithms, mutual information formula, and (for broadcasting on trees) reconstruction, robust reconstruction, uniqueness of belief propagation fixed points, boundary irrelevance, and so on.

 

Committee:
Prof. Yury Polyanskiy (thesis advisor)
Prof. Guy Bresler
Prof. Elchanan Mossel