Recent advances in network-based methods for disease gene prediction

Brief Bioinform. 2021 Jul 20;22(4):bbaa303. doi: 10.1093/bib/bbaa303.

Abstract

Disease-gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases needs statistical analysis of associations. Considering the huge number of possible mutations, in addition to its high cost, another important drawback of GWAS analysis is the large number of false positives. Thus, researchers search for more evidence to cross-check their results through different sources. To provide the researchers with alternative and complementary low-cost disease-gene association evidence, computational approaches come into play. Since molecular networks are able to capture complex interplay among molecules in diseases, they become one of the most extensively used data for disease-gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis on 14 state-of-the-art methods. To summarize, we first elucidate the task definition for disease gene prediction. Secondly, we categorize existing network-based efforts into network diffusion methods, traditional machine learning methods with handcrafted graph features and graph representation learning methods. Thirdly, an empirical analysis is conducted to evaluate the performance of the selected methods across seven diseases. We also provide distinguishing findings about the discussed methods based on our empirical analysis. Finally, we highlight potential research directions for future studies on disease gene prediction.

Keywords: disease gene prediction; graph representation learning; network-based methods.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Nucleic Acid*
  • Disease / genetics*
  • Genome-Wide Association Study
  • Humans
  • Machine Learning*
  • Polymorphism, Single Nucleotide*