New Developments and Possibilities in Reanalysis and Reinterpretation of Whole Exome Sequencing Datasets for Unsolved Rare Diseases Using Machine Learning Approaches

Int J Mol Sci. 2022 Jun 18;23(12):6792. doi: 10.3390/ijms23126792.

Abstract

Rare diseases affect 300 million people worldwide. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of the causes of 20-30% of rare diseases. However, most rare diseases remain unsolved to date. Newer tools and the availability of high-throughput sequencing data have enabled the reanalysis of data from previously undiagnosed patients. In this review, we systematically compile the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. Importantly, we detail the methods available to reanalyze existing whole exome sequencing data from unsolved rare disease cases. We identify reanalysis methodologies that address problems associated with sequence alterations/mutations, variation re-annotation, protein stability, splice isoform malfunctions and oligogenic analysis. In addition, we give an overview of new developments in rare disease research using whole genome sequencing data and other omics.

Keywords: machine learning; rare diseases; reanalysis.

Publication types

  • Review

MeSH terms

  • Exome Sequencing / methods
  • Exome* / genetics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Machine Learning
  • Rare Diseases* / diagnosis
  • Rare Diseases* / genetics