PDIVAS: Pathogenicity predictor for Deep-Intronic Variants causing Aberrant Splicing

Ryo Kurosawa; Kei Iida; Masahiko Ajiro; Tomonari Awaya; Mamiko Yamada; Kenjiro Kosaki; Masatoshi Hagiwara

doi:10.1186/s12864-023-09645-2

BMC Genomics. 2023 Oct 10;24(1):601. doi: 10.1186/s12864-023-09645-2.

Authors

Ryo Kurosawa¹, Kei Iida²,³, Masahiko Ajiro⁴,⁵, Tomonari Awaya⁶,⁷, Mamiko Yamada⁸, Kenjiro Kosaki⁸, Masatoshi Hagiwara⁹

Affiliations

¹ Department of Anatomy and Developmental Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan. kurosawa.ryo.43r@st.kyoto-u.ac.jp.
² Faculty of Science and Engineering, Kindai University, 3-4-1 Kowakae, Higashi-osaka, Osaka, 577-8502, Japan.
³ Medical Research Support Center, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan.
⁴ Division of Cancer RNA Research, National Cancer Center Research Institute, Tokyo, 104-0045, Japan.
⁵ Department of Drug Discovery Medicine, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan.
⁶ Department of Anatomy and Developmental Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan.
⁷ Laboratory of Tumor Microenvironment and Immunity, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan.
⁸ Center for Medical Genetics, Keio University School of Medicine, Tokyo, 160-8582, Japan.
⁹ Department of Anatomy and Developmental Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto, 606-8501, Japan. hagiwara.masatoshi.8c@kyoto-u.ac.jp.

Abstract

Background: Deep-intronic variants that alter RNA splicing have not been effectively evaluated in the search for the causes of genetic diseases. Identifying such pathogenic variants among the vast number of deep-intronic variants (approximately 1,500,000 per individual) is a technical challenge for researchers. We therefore developed a Pathogenicity predictor for Deep-Intronic Variants causing Aberrant Splicing (PDIVAS) to easily detect pathogenic deep-intronic variants.

Results: PDIVAS was trained with an ensemble machine-learning algorithm to classify pathogenic and benign variants in a curated dataset. The dataset consists of manually curated pathogenic splice-altering variants (SAVs) and commonly observed benign variants within deep introns. Splicing features and a splicing constraint metric were used to maximize predictive sensitivity and specificity, respectively. PDIVAS achieved an average precision of 0.92 and a maximum Matthews correlation coefficient (MCC) of 0.88 in classifying these variants, outperforming previous predictors. When PDIVAS was applied to genome sequencing analysis at a threshold giving 95% sensitivity for reported pathogenic SAVs, an average of 27 pathogenic candidates were extracted per individual. Furthermore, the causative variants in simulated patient genomes were prioritized more efficiently than by previous predictors.
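
The evaluation metrics named in the Results (average precision, maximum MCC, and a score threshold chosen for a target sensitivity) can be illustrated with a minimal sketch. This is not the PDIVAS implementation: the function names and the toy scores and labels below are hypothetical, and the sketch assumes only that a predictor outputs one score per variant, with higher scores meaning more likely pathogenic.

```python
import math

def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when score >= threshold is called pathogenic."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        called = s >= threshold
        if called and y:
            tp += 1
        elif called:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def max_mcc(scores, labels):
    """Largest MCC over all thresholds that coincide with an observed score."""
    return max(mcc(*confusion(scores, labels, t)) for t in set(scores))

def average_precision(scores, labels):
    """Sum of (recall step) * precision, walking thresholds from high to low."""
    n_pos = sum(labels)
    ap, prev_recall = 0.0, 0.0
    for t in sorted(set(scores), reverse=True):
        tp, fp, _, _ = confusion(scores, labels, t)
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap

def threshold_at_sensitivity(scores, labels, target):
    """Highest threshold at which at least `target` of the positives are called."""
    pos = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    k = math.ceil(target * len(pos))  # number of positives that must be recovered
    return pos[k - 1]

# Hypothetical toy data: 1 = pathogenic SAV, 0 = benign variant.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

print("average precision:", average_precision(scores, labels))
print("maximum MCC:", max_mcc(scores, labels))
print("threshold for 75% sensitivity:", threshold_at_sensitivity(scores, labels, 0.75))
```

In a genome-wide setting, the sensitivity target would be fixed first (95% in the abstract), and the number of variants per individual scoring at or above the resulting threshold would then be counted as candidates.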

Conclusion: Incorporating PDIVAS into variant interpretation pipelines will enable efficient detection of disease-causing deep-intronic SAVs and contribute to improving the diagnostic yield. PDIVAS is publicly available at https://github.com/shiro-kur/PDIVAS .

Keywords: Deep intron; Genomics; Machine learning; Non-coding region; Pathogenicity prediction; RNA splicing; Variant interpretation.

MeSH terms

Humans
Introns
Mutation
RNA Splicing*
Virulence
