BEATRICE: Bayesian Fine-mapping from Summary Data using Deep Variational Inference

bioRxiv [Preprint]. 2023 Dec 14:2023.03.24.534116. doi: 10.1101/2023.03.24.534116.

Abstract

We introduce BEATRICE, a novel framework for identifying putative causal variants from GWAS summary statistics (https://github.com/sayangsep/Beatrice-Finemapping). Identifying causal variants is challenging because they are sparse and because nearby variants are highly correlated. To address these challenges, our approach relies on a hierarchical Bayesian model that imposes a binary concrete prior on the set of causal variants. We derive a variational algorithm for this fine-mapping problem by minimizing the KL divergence between an approximate density and the posterior probability distribution of the causal configurations. We use a deep neural network as an inference machine to estimate the parameters of our proposal distribution. Our stochastic optimization procedure allows us to simultaneously sample from the space of causal configurations. We use these samples to compute the posterior inclusion probabilities and to determine credible sets for each causal variant. We conduct a detailed simulation study to quantify the performance of our framework across different numbers of causal variants and different noise paradigms, as defined by the relative genetic contributions of causal and non-causal variants. Using these simulated data, we perform a comparative analysis against two state-of-the-art baseline methods for fine-mapping. We demonstrate that BEATRICE achieves uniformly better coverage with comparable power and set sizes, and that the performance gain increases with the number of causal variants. Thus, BEATRICE is a valuable tool for identifying causal variants from eQTL and GWAS summary statistics across complex diseases and traits.
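
The sampling step described in the abstract can be illustrated with a minimal sketch. It is not taken from the BEATRICE code base. It assumes the standard reparameterized form of the binary concrete (relaxed Bernoulli) distribution, in which logistic noise is added to a per-variant logit and the sum is divided by a temperature and passed through a sigmoid. The function names, the temperature value, and the example logits are all hypothetical. In BEATRICE the distribution parameters come from a neural network; here they are simply fixed by hand. Averaging relaxed samples only approximates an inclusion probability, and the approximation tightens as the temperature decreases.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_binary_concrete(logits, temperature, rng):
    """Draw one relaxed binary vector with one entry in (0, 1) per variant.

    Assumption: standard binary concrete reparameterization,
    sigmoid((logit + logistic_noise) / temperature).
    """
    sample = []
    for alpha in logits:
        # Clamp the uniform draw so that neither logarithm sees zero.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        logistic_noise = math.log(u) - math.log(1.0 - u)
        sample.append(sigmoid((alpha + logistic_noise) / temperature))
    return sample


def estimate_pips(logits, temperature=0.5, n_samples=2000, seed=0):
    """Monte Carlo average of relaxed samples, used here as a stand-in
    for posterior inclusion probabilities (illustrative only)."""
    rng = random.Random(seed)
    totals = [0.0] * len(logits)
    for _ in range(n_samples):
        draw = sample_binary_concrete(logits, temperature, rng)
        for i, value in enumerate(draw):
            totals[i] += value
    return [total / n_samples for total in totals]


# Hypothetical logits for four variants: only the second is likely causal,
# and the fourth is maximally uncertain.
example_logits = [-4.0, 3.0, -4.0, 0.0]
pips = estimate_pips(example_logits)
```

Variants with larger logits receive larger averaged values, so ranking by `pips` recovers the ordering implied by the logits. The credible-set construction used by BEATRICE is not reproduced here.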

Publication types

  • Preprint