PrGeFNE: Predicting disease-related genes by fast network embedding

Methods. 2021 Aug:192:3-12. doi: 10.1016/j.ymeth.2020.06.015. Epub 2020 Jun 28.

Abstract

Identifying disease-related genes is important for understanding the molecular mechanisms of diseases, as well as for their diagnosis and treatment. Many computational methods have been proposed to predict disease-related genes, but making full use of multi-source biological data to improve disease-gene prediction remains challenging. In this paper, we proposed a novel method for predicting disease-related genes by fast network embedding (PrGeFNE), which can integrate multiple types of associations related to diseases and genes. Specifically, we first constructed a heterogeneous network from phenotype-disease, disease-gene, protein-protein and gene-GO associations, and extracted low-dimensional representations of its nodes with a fast network embedding algorithm. We then reconstructed a dual-layer heterogeneous network from these low-dimensional representations and applied network propagation to it to predict disease-related genes. Through cross-validation and validation on newly added associations, we demonstrated the important roles of different types of association data in improving disease-gene prediction, and confirmed the excellent performance of PrGeFNE by comparison with state-of-the-art algorithms. Furthermore, we developed a web tool that lets researchers search for the candidate genes that PrGeFNE predicts for different diseases, along with GO and pathway enrichment analysis of the candidate gene set. This may be useful for investigating the molecular mechanisms of diseases and for their experimental validation. The web tool is available at http://bioinformatics.csu.edu.cn/prgefne/.
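
The propagation step on a dual-layer (disease layer plus gene layer) network can be illustrated with a minimal sketch. The abstract does not specify the propagation algorithm, so random walk with restart is used here as an assumed stand-in. The toy network, the node names, the function names (column_normalize, propagate) and the restart value of 0.5 are all hypothetical and are not taken from PrGeFNE.

```python
import numpy as np

def column_normalize(m):
    """Scale each column to sum to 1; all-zero columns stay zero."""
    sums = m.sum(axis=0, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m, dtype=float), where=sums > 0)

def propagate(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart: p <- (1 - r) * W p + r * p0 (assumed scheme)."""
    w = column_normalize(adjacency.astype(float))
    p0 = seeds / seeds.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical toy data: 2 diseases and 4 genes.
diseases = ["d0", "d1"]
genes = ["g0", "g1", "g2", "g3"]

disease_layer = np.array([[0, 1],
                          [1, 0]])            # disease-disease links
gene_layer = np.array([[0, 1, 0, 0],
                       [1, 0, 1, 0],
                       [0, 1, 0, 1],
                       [0, 0, 1, 0]])         # gene-gene links
disease_gene = np.array([[1, 0, 0, 0],
                         [0, 0, 0, 1]])       # known disease-gene links

# Dual-layer network as one block adjacency matrix.
adjacency = np.block([[disease_layer, disease_gene],
                      [disease_gene.T, gene_layer]])

# Seed the walk at the query disease d0.
seeds = np.zeros(len(diseases) + len(genes))
seeds[0] = 1.0

scores = propagate(adjacency, seeds)
gene_scores = dict(zip(genes, scores[len(diseases):]))
top_gene = max(gene_scores, key=gene_scores.get)
print(top_gene, gene_scores)
```

In this toy network the gene directly linked to the query disease (g0) receives the highest score, and scores fall off with network distance from the seed. In a realistic setting the two layers would come from the reconstructed similarity networks rather than hand-written matrices.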

Keywords: Disease-gene prediction; Heterogeneous network; Multi-source biological data; Network embedding; Network propagation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology*
  • Proteins

Substances

  • Proteins