Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models

Wednesday, April 4, 2018 - 4:30pm to Thursday, April 5, 2018 - 4:55pm

Event Calendar Category

LIDS & Stats Tea

Speaker Name

Raj Agrawal

Affiliation

LIDS

Building and Room Number

LIDS Lounge

Learning a Bayesian network (BN) from data can be useful for decision-making or discovering causal relationships, but traditional methods can fail in modern applications, which often have more observed variables than data points. The resulting uncertainty about the underlying network, together with the ability to incorporate prior information, recommends a Bayesian approach to learning the BN, but the highly combinatorial structure of BNs poses a striking challenge for inference. The current state-of-the-art method, order MCMC, is faster than previous methods but prevents the use of many natural structural priors and still has running time exponential in the maximum indegree of the true directed acyclic graph (DAG) of the BN. Here we propose an alternative posterior approximation based on the observation that, if we incorporate empirical conditional independence tests, we can focus on a high-probability DAG associated with each permutation of the variables. We show that our method allows the desired flexibility in prior specification, removes the running-time dependence on the maximum indegree, yields provably good posterior approximations, and achieves superior accuracy, scalability, and sampler mixing on several datasets.
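To make the idea of "a DAG associated with each permutation" concrete, here is a minimal sketch of how a minimal I-MAP could be built for one fixed ordering using empirical conditional independence tests. It is an illustration under stated assumptions, not the speaker's implementation: it assumes jointly Gaussian data and uses a Fisher z-test on partial correlations, and the names `partial_corr`, `is_independent`, and `minimal_imap` are hypothetical. It covers only the construction of one DAG from one permutation, not the MCMC sampler over permutations.

```python
import math
import numpy as np


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(corr[np.ix_(idx, idx)])
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def is_independent(corr, n, i, j, cond, alpha=0.05):
    """Fisher z-test (assumes Gaussian data); True if independence is not rejected."""
    r = partial_corr(corr, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)  # guard against log(0)
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2))  # two-sided normal p-value
    return p_value > alpha


def minimal_imap(data, perm, alpha=0.05):
    """Edges (parent, child) of the estimated minimal I-MAP for ordering perm.

    An earlier variable u becomes a parent of a later variable v unless u and v
    test as independent given all other variables that precede v in perm.
    """
    n, _ = data.shape
    corr = np.corrcoef(data, rowvar=False)
    edges = set()
    for pos in range(1, len(perm)):
        child = perm[pos]
        preds = perm[:pos]
        for parent in preds:
            cond = [v for v in preds if v != parent]
            if not is_independent(corr, n, parent, child, cond, alpha):
                edges.add((parent, child))
    return edges


# Toy usage: synthetic chain X0 -> X1 -> X2 (made-up data for illustration).
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2])

perm = [0, 1, 2]
edges = minimal_imap(data, perm)
print(sorted(edges))
```

Each permutation maps to exactly one DAG in this construction, which is why a sampler can move over orderings rather than over the much larger space of DAGs. The test level `alpha` is a free choice in this sketch.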