gGATLDA: lncRNA-disease association prediction based on graph-level graph attention network

BMC Bioinformatics. 2022 Jan 4;23(1):11. doi: 10.1186/s12859-021-04548-z.

Abstract

Background: Long non-coding RNAs (lncRNAs) are related to human diseases through their regulation of gene expression. Identifying lncRNA-disease associations (LDAs) will contribute to the diagnosis, treatment, and prognosis of diseases. However, identifying LDAs through biological experiments is time-consuming, costly, and inefficient. Therefore, the development of efficient and accurate computational methods for predicting LDAs is of great significance.

Results: In this paper, we propose a novel computational method (gGATLDA) to predict LDAs based on a graph-level graph attention network. First, we extract the enclosing subgraph of each lncRNA-disease pair. Second, we construct feature vectors by integrating lncRNA similarity and disease similarity as node attributes in the subgraphs. Finally, we train a graph neural network (GNN) model on the subgraphs and feature vectors, and use the trained model to predict potential lncRNA-disease association scores. The experimental results show that our method achieves a higher area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPR), accuracy, and F1-score than state-of-the-art methods in five-fold cross-validation. Case studies show that our method can effectively identify lncRNAs associated with breast cancer, gastric cancer, prostate cancer, and renal cancer.
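
The subgraph-extraction step can be illustrated with a minimal sketch. The abstract gives neither the hop count nor the data structures, so the following are assumptions made only for illustration: a 1-hop neighborhood by default, a plain bipartite adjacency map, the hypothetical function names build_adjacency and enclosing_subgraph, and the removal of the target link from the subgraph (a common practice in subgraph-based link prediction, not confirmed here as the authors' procedure).

```python
from collections import defaultdict


def build_adjacency(pairs):
    """Build a bipartite adjacency map from known (lncRNA, disease) pairs."""
    adj = defaultdict(set)
    for lnc, dis in pairs:
        adj[("lncRNA", lnc)].add(("disease", dis))
        adj[("disease", dis)].add(("lncRNA", lnc))
    return adj


def enclosing_subgraph(adj, lnc, dis, hops=1):
    """Return (nodes, edges) of the h-hop enclosing subgraph of a target pair.

    Assumption: hops=1 is an illustrative default, not a value from the paper.
    Edges are stored as (lncRNA node, disease node) tuples.
    """
    target_l = ("lncRNA", lnc)
    target_d = ("disease", dis)
    nodes = {target_l, target_d}
    frontier = {target_l, target_d}

    # Expand outward from both target nodes, one hop at a time.
    for _ in range(hops):
        reached = set()
        for node in frontier:
            reached |= adj.get(node, set())
        reached -= nodes
        nodes |= reached
        frontier = reached

    # Keep every known association whose endpoints both lie in the subgraph,
    # except the target link itself (assumed masking to avoid label leakage).
    edges = set()
    for u in nodes:
        if u[0] != "lncRNA":
            continue
        for v in adj.get(u, set()):
            if v in nodes and (u, v) != (target_l, target_d):
                edges.add((u, v))
    return nodes, edges


# Toy data with made-up identifiers, for illustration only.
known_pairs = [("L1", "D1"), ("L1", "D2"), ("L2", "D1"), ("L3", "D3")]
adjacency = build_adjacency(known_pairs)
sub_nodes, sub_edges = enclosing_subgraph(adjacency, "L1", "D1")
```

In a full pipeline, each node in such a subgraph would additionally carry a feature vector built from the lncRNA and disease similarities, and the subgraph would be passed to the GNN for graph-level classification; those steps are omitted here.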

Conclusion: The experimental results indicate that our method is a useful approach for predicting potential LDAs.

Keywords: Disease similarity based on gene–gene interaction network; Gaussian interaction profile kernel similarity of lncRNAs; Graph attention network; lncRNA-disease association prediction.

MeSH terms

  • Algorithms
  • Area Under Curve
  • Computational Biology
  • Humans
  • Neoplasms / genetics*
  • Neural Networks, Computer
  • RNA, Long Noncoding* / genetics

Substances

  • RNA, Long Noncoding