Prediction of lncRNA-protein interactions using HeteSim scores based on heterogeneous networks

Sci Rep. 2017 Jun 16;7(1):3664. doi: 10.1038/s41598-017-03986-1.

Abstract

Numerous studies have indicated that long non-coding RNAs (lncRNAs) are critical for the regulation of cellular biological processes by binding with RNA-related proteins. However, only a few experimentally supported lncRNA-protein associations have been reported. Existing network-based methods typically focus on intrinsic features of lncRNAs and proteins but ignore the information implicit in the topologies of biological networks associated with lncRNAs. To address these limitations, we propose PLPIHS, an effective computational method for Predicting lncRNA-Protein Interactions using HeteSim Scores. PLPIHS uses the HeteSim measure to calculate a relatedness score for each lncRNA-protein pair in a heterogeneous network, which consists of a lncRNA-lncRNA similarity network, a lncRNA-protein association network, and a protein-protein interaction network. The HeteSim scores are then used to build an SVM classifier that predicts lncRNA-protein interactions. The results show that PLPIHS performs significantly better than existing state-of-the-art approaches and achieves an AUC score of 0.97 in the leave-one-out validation test. We also compare performance on networks with different connectivity densities and find that PLPIHS performs well across all of them. Furthermore, we use the proposed method to identify proteins related to the lncRNA MALAT1. The highly ranked proteins are supported by published biological studies, which demonstrates the effectiveness of our method.
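To make the idea of a path-based relatedness score concrete, the following is a minimal sketch of a HeteSim-style score along a single path, lncRNA -> protein -> protein. It is not the PLPIHS implementation: the abstract does not state which paths are used, and the toy matrices and the names `row_normalize` and `hetesim_lpp` are assumptions made for illustration. The sketch follows the general HeteSim definition for an even-length path, in which the source and the target each walk to the middle node type and the two reachability vectors are compared by cosine similarity. Odd-length paths need an extra edge-decomposition step that is omitted here.

```python
import math


def row_normalize(matrix):
    """Turn each row of a non-negative adjacency matrix into transition probabilities."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hetesim_lpp(lnc_prot, prot_prot):
    """HeteSim-style scores along the path lncRNA -> protein -> protein.

    lnc_prot:  lncRNA x protein association matrix (hypothetical toy input)
    prot_prot: protein x protein interaction matrix (hypothetical toy input)
    Returns a lncRNA x protein matrix of scores in [0, 1].
    """
    # Left half: probability of each lncRNA reaching each middle protein.
    left = row_normalize(lnc_prot)
    # Right half, walked backwards: target protein reaching each middle protein.
    right = row_normalize(transpose(prot_prot))
    return [[cosine(l_row, r_row) for r_row in right] for l_row in left]


# Toy network: 2 lncRNAs, 3 proteins (illustrative values only).
lnc_prot = [
    [1, 1, 0],
    [0, 0, 1],
]
prot_prot = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

scores = hetesim_lpp(lnc_prot, prot_prot)
for i, row in enumerate(scores):
    print(f"lncRNA {i}:", [round(s, 3) for s in row])
```

In a pipeline of the kind the abstract describes, scores computed along several such paths for each lncRNA-protein pair could be stacked into a feature vector and passed to an SVM classifier; that step is not shown here.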

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Genetic
  • Humans
  • Neural Networks, Computer
  • RNA, Long Noncoding / metabolism*
  • RNA-Binding Proteins / metabolism*
  • Reproducibility of Results
  • Software*

Substances

  • RNA, Long Noncoding
  • RNA-Binding Proteins