Assessing predictions of the impact of variants on splicing in CAGI5

Hum Mutat. 2019 Sep;40(9):1215-1224. doi: 10.1002/humu.23869. Epub 2019 Aug 19.

Abstract

Precision medicine and sequence-based clinical diagnostics seek to predict disease risk or to identify causative variants from sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. Few previous CAGI challenges have addressed the impact of sequence variants on splicing. In CAGI5, two challenges (Vex-seq and MaPSY) involved prediction of the effect of variants, primarily single-nucleotide changes, on splicing. Although the two challenges differ substantially, both involved predicting the results of high-throughput exon inclusion assays. Here, we discuss the methods used to predict the impact of these variants on splicing; their performance, strengths, and weaknesses; and the prospects for predicting the impact of sequence variation on splicing and disease phenotypes.

Keywords: CAGI experiment; machine learning; mutation; splicing; variant interpretation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alternative Splicing*
  • Animals
  • Computational Biology / methods*
  • Congresses as Topic
  • Genetic Fitness
  • Humans
  • Models, Genetic
  • Mutation*
  • Proteins / genetics*
  • Sequence Homology, Nucleic Acid

Substances

  • Proteins