A Bayesian framework for efficient and accurate variant prediction

PLoS One. 2018 Sep 13;13(9):e0203553. doi: 10.1371/journal.pone.0203553. eCollection 2018.

Abstract

There is a growing need to develop variant prediction tools capable of assessing a wide spectrum of evidence. We present a Bayesian framework that aggregates pathogenicity data across multiple in silico scores on a gene-by-gene basis and across multiple evidence statistics in both quantitative and qualitative forms, and that performs 5-tiered variant classification based on the credible interval of the resulting probability. When evaluated on 1,161 missense variants, our gene-specific in silico model-based meta-predictor yielded an area under the curve (AUC) of 96.0% and outperformed all other in silico predictors. Multifactorial model analysis incorporating all available evidence yielded 99.7% AUC, with 22.8% of variants predicted as variants of uncertain significance (VUS). Use of only 3 auto-computed evidence statistics yielded 98.6% AUC with 56.0% predicted as VUS, which is sufficient accuracy to rapidly assign a significant portion of VUS to clinically meaningful classifications. Collectively, our findings support the use of this framework to conduct large-scale variant prioritization using in silico predictors, followed by variant prediction and classification with a high degree of predictive accuracy.
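To make the idea of classifying on a credible interval concrete, the sketch below shows one simple way such a scheme could work. It is not the authors' model. It assumes that each piece of evidence contributes an independent log likelihood ratio with Gaussian uncertainty, so the posterior log-odds of pathogenicity are normally distributed, and it assumes IARC-style probability cutoffs (0.99, 0.95, 0.05, 0.001) for the five tiers. The function names, the prior, and the evidence values are all hypothetical.

```python
import math
from statistics import NormalDist


def sigmoid(x):
    """Numerically stable logistic function mapping log-odds to probability."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def posterior_interval(prior, evidence, level=0.95):
    """Return (lower, upper) bounds of a credible interval for P(pathogenic).

    prior    -- prior probability of pathogenicity (assumed value, 0 < prior < 1)
    evidence -- list of (mean, sd) pairs for each log likelihood ratio;
                independence between items is an assumption of this sketch
    """
    prior_log_odds = math.log(prior / (1.0 - prior))
    mean = prior_log_odds + sum(m for m, _ in evidence)
    sd = math.sqrt(sum(s * s for _, s in evidence))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return sigmoid(mean - z * sd), sigmoid(mean + z * sd)


def classify(lower, upper):
    """Assign one of five tiers using assumed IARC-style cutoffs.

    A tier is assigned only when the whole interval clears the cutoff;
    otherwise the variant stays of uncertain significance.
    """
    if lower > 0.99:
        return "pathogenic"
    if lower > 0.95:
        return "likely pathogenic"
    if upper < 0.001:
        return "benign"
    if upper < 0.05:
        return "likely benign"
    return "uncertain significance"


# Hypothetical variant: three evidence items, each with likelihood ratio > 1.
evidence = [(math.log(50), 0.3), (math.log(20), 0.3), (math.log(10), 0.3)]
lower, upper = posterior_interval(prior=0.1, evidence=evidence)
print(classify(lower, upper), round(lower, 4), round(upper, 4))
```

Under these assumptions, requiring the entire interval to clear a cutoff means that noisy or sparse evidence widens the interval and leaves more variants in the uncertain tier. That is the general kind of trade-off between accuracy and VUS rate that the abstract reports, though the actual model may differ substantially.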

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Bayes Theorem*
  • Genetic Predisposition to Disease / genetics
  • Genetic Testing
  • Mutation, Missense / genetics

Grants and funding

All authors are employees of Ambry Genetics; the work was therefore funded by Ambry Genetics. The funder had no role in study design, data collection, methodology development and analysis, decision to publish, or preparation of the manuscript.