Learning Representation of Molecules in Association Network for Predicting Intermolecular Associations

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2546-2554. doi: 10.1109/TCBB.2020.2973091. Epub 2021 Dec 8.

Abstract

A key aim of post-genomic biomedical research is to systematically understand molecules and their interactions in human cells. Multiple biomolecules coordinate to sustain life activities, and the interactions between them are interconnected. However, existing studies usually focus only on associations between two or a very limited number of molecule types. In this study, we propose MAN-SDNE, a computational framework based on network representation learning, to predict arbitrary intermolecular associations. More specifically, we constructed a large-scale molecular association network of multiple human biomolecules by integrating associations among long non-coding RNAs, microRNAs, proteins, drugs, and diseases; the network contains 6,528 molecular nodes and 105,546 associations of 9 kinds. Each node is then represented by its network proximity and attribute features. These features are used to train a Random Forest classifier to predict intermolecular associations. MAN-SDNE achieves an AUC of 0.9552 and an AUPR of 0.9338 under five-fold cross-validation. To demonstrate its ability to predict specific types of interactions, a case study predicting lncRNA-protein interactions with MAN-SDNE was also carried out. The experimental results show that this work offers systematic insight into the synergistic associations between molecules and complex diseases and provides a network-based computational tool for systematically exploring intermolecular interactions.
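The data-preparation step implied by the abstract (each node described by a proximity embedding plus attribute features, then a classifier trained on node pairs) might be sketched as below. This is only an illustrative sketch under assumptions the abstract does not state: a pair is represented by concatenating the two nodes' vectors, negatives are sampled uniformly from unlinked pairs in equal number to positives, and all node names and vectors are made-up toy values. The SDNE embedding and the Random Forest training are not reproduced here; the helper names `node_feature`, `build_samples`, and `auc` are hypothetical.

```python
import random
from itertools import combinations


def node_feature(node, embedding, attributes):
    """Concatenate a node's network-proximity embedding with its attribute vector."""
    return embedding[node] + attributes[node]


def build_samples(nodes, associations, embedding, attributes, seed=0):
    """Return (features, labels) with one sampled negative pair per positive pair.

    Assumption: negatives are drawn uniformly from node pairs with no known
    association; the abstract does not specify the sampling scheme.
    """
    rng = random.Random(seed)
    positives = {frozenset(pair) for pair in associations}
    candidates = [frozenset(p) for p in combinations(sorted(nodes), 2)
                  if frozenset(p) not in positives]
    negatives = rng.sample(candidates, len(positives))

    features, labels = [], []
    for label, pairs in ((1, positives), (0, negatives)):
        for a, b in sorted(sorted(pair) for pair in pairs):
            # Assumed pair representation: concatenation of both node vectors.
            features.append(node_feature(a, embedding, attributes)
                            + node_feature(b, embedding, attributes))
            labels.append(label)
    return features, labels


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy network (hypothetical values, not data from the study).
nodes = ["lncRNA1", "miRNA1", "protein1", "drug1", "disease1"]
embedding = {n: [float(i), float(i) / 2] for i, n in enumerate(nodes)}
attributes = {n: [float(len(n))] for n in nodes}
associations = [("lncRNA1", "miRNA1"), ("miRNA1", "disease1"), ("drug1", "protein1")]

features, labels = build_samples(nodes, associations, embedding, attributes)
```

The resulting `features` and `labels` could then be passed to any classifier and evaluated fold by fold; `auc` shows how the reported AUC metric is defined in terms of ranking positives above negatives.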

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Humans
  • MicroRNAs / genetics
  • MicroRNAs / metabolism
  • Models, Biological*
  • Pharmaceutical Preparations / metabolism
  • RNA, Long Noncoding / genetics
  • RNA, Long Noncoding / metabolism
  • Systems Biology / methods*

Substances

  • MicroRNAs
  • Pharmaceutical Preparations
  • RNA, Long Noncoding