LMGATCDA: Graph Neural Network With Labeling Trick for Predicting circRNA-Disease Associations

IEEE/ACM Trans Comput Biol Bioinform. 2024 Mar-Apr;21(2):289-300. doi: 10.1109/TCBB.2024.3355093. Epub 2024 Apr 3.

Abstract

Previous studies have shown that circular RNAs (circRNAs) are inextricably connected to the etiology and pathophysiology of complicated diseases. Since conventional biological experiments are frequently small-scale, expensive, and time-consuming, it is essential to establish an efficient and reasonable computation-based method to identify disease-related circRNAs. In this article, we propose a novel ensemble model for predicting probable circRNA-disease associations based on multi-source similarity information (LMGATCDA). In particular, LMGATCDA first incorporates information on circRNA functional similarity, disease semantic similarity, and the Gaussian interaction profile (GIP) kernel similarity as explicit features, along with node-labeling of the three-hop subgraphs extracted from each linked target node as graph structural features. After that, the fused features are used as input, and further implied features are extracted by graph sampling aggregation (GraphSAGE) and multi-hop attention graph neural network (MAGNA). Finally, the prediction scores are obtained through a fully connected layer. With five-fold cross-validation, LMGATCDA demonstrated excellent competitiveness against gold standard data, reaching 95.37% accuracy and 91.31% recall with an AUC of 94.25% on the circR2Disease benchmark dataset. Collectively, the noteworthy findings from these case studies support our conclusion that the LMGATCDA model can provide reliable circRNA-disease associations for clinical research while helping to mitigate experimental uncertainties in wet-lab investigations.
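As an illustration of one of the explicit features named above, the GIP kernel similarity can be sketched as follows. This is a minimal sketch of the commonly used formulation, not the authors' implementation: the abstract does not give the bandwidth, so the normalization by the mean squared profile norm and the parameter `gamma_prime` are assumptions, and the function name `gip_kernel_similarity` and the toy association matrix are hypothetical.

```python
import numpy as np


def gip_kernel_similarity(assoc, gamma_prime=1.0):
    """Gaussian interaction profile (GIP) kernel similarity between rows.

    assoc: binary association matrix; each row is one entity's
           interaction profile (e.g. circRNAs x diseases).
    gamma_prime: assumed bandwidth scale (not stated in the abstract).
    """
    assoc = np.asarray(assoc, dtype=float)
    # Squared Euclidean norm of every interaction profile.
    sq_norms = np.sum(assoc ** 2, axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("association matrix has no known associations")
    # Assumed normalization: bandwidth scaled by the mean squared norm.
    gamma = gamma_prime / mean_sq
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (assoc @ assoc.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negatives
    return np.exp(-gamma * sq_dists)


# Hypothetical toy data: 3 circRNAs (rows) x 3 diseases (columns).
toy_assoc = np.array([[1, 0, 1],
                      [1, 0, 1],
                      [0, 1, 0]])
circ_sim = gip_kernel_similarity(toy_assoc)       # circRNA-circRNA similarity
dis_sim = gip_kernel_similarity(toy_assoc.T)      # disease-disease similarity
```

Transposing the association matrix gives the disease-side similarity from the same function. In a pipeline like the one described, such matrices would be combined with the functional and semantic similarities before being passed to the graph neural network layers.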

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Neural Networks, Computer*
  • RNA, Circular* / genetics

Substances

  • RNA, Circular