Assessing predictions of the impact of variants on splicing in CAGI5

Stephen M Mount; Žiga Avsec; Liran Carmel; Rita Casadio; Muhammed Hasan Çelik; Ken Chen; Jun Cheng; Noa E Cohen; William G Fairbrother; Tzila Fenesh; Julien Gagneur; Valer Gotea; Tamar Holzer; Chiao-Feng Lin; Pier Luigi Martelli; Tatsuhiko Naito; Thi Yen Duong Nguyen; Castrense Savojardo; Ron Unger; Robert Wang; Yuedong Yang; Huiying Zhao

doi:10.1002/humu.23869

Hum Mutat. 2019 Sep;40(9):1215-1224. doi: 10.1002/humu.23869. Epub 2019 Aug 19.

Authors

Stephen M Mount¹, Žiga Avsec², Liran Carmel³, Rita Casadio⁴, Muhammed Hasan Çelik², Ken Chen⁵, Jun Cheng², Noa E Cohen³,⁶, William G Fairbrother⁷, Tzila Fenesh⁸, Julien Gagneur², Valer Gotea⁹, Tamar Holzer⁸, Chiao-Feng Lin¹⁰, Pier Luigi Martelli⁴, Tatsuhiko Naito¹¹, Thi Yen Duong Nguyen², Castrense Savojardo⁴, Ron Unger⁸, Robert Wang¹²,¹³, Yuedong Yang⁵, Huiying Zhao¹⁴

Affiliations

¹ Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland.
² Department of Informatics, Technical University of Munich, Garching, Germany.
³ Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.
⁴ Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy.
⁵ School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China.
⁶ The integrated program for Computer Science and Computational Biology, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
⁷ Department of Molecular Biology, Cell Biology, and Biochemistry, Center For Computational Biology, Brown University, Providence, Rhode Island.
⁸ The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel.
⁹ National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH), Bethesda, Maryland.
¹⁰ Translational Informatics, DNAnexus, Mountain View, California.
¹¹ Department of Neurology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
¹² Department of Bioengineering, University of California, Berkeley, California.
¹³ Department of Plant and Molecular Biology, University of California, Berkeley, California.
¹⁴ Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.

Abstract

Precision medicine and sequence-based clinical diagnostics seek to predict disease risk or to identify causative variants from sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. In the past, few CAGI challenges have addressed the impact of sequence variants on splicing. In CAGI5, two challenges (Vex-seq and MaPSY) involved prediction of the effect of variants, primarily single-nucleotide changes, on splicing. Although there are significant differences between these two challenges, both involved prediction of results from high-throughput exon inclusion assays. Here, we discuss the methods used to predict the impact of these variants on splicing, their performance, strengths, and weaknesses, and prospects for predicting the impact of sequence variation on splicing and disease phenotypes.

Keywords: CAGI experiment; machine learning; mutation; splicing; variant interpretation.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Alternative Splicing*
Animals
Computational Biology / methods*
Congresses as Topic
Genetic Fitness
Humans
Models, Genetic
Mutation*
Proteins / genetics*
Sequence Homology, Nucleic Acid

Substances

Proteins
