Integrating random walk with restart and k-Nearest Neighbor to identify novel circRNA-disease association

Sci Rep. 2020 Feb 6;10(1):1943. doi: 10.1038/s41598-020-59040-0.

Abstract

CircRNA is a special type of non-coding RNA, which is closely related to the occurrence and development of many complex human diseases. However, it is time-consuming and expensive to determine the circRNA-disease associations through experimental methods. Therefore, based on the existing databases, we propose a method named RWRKNN, which integrates the random walk with restart (RWR) and k-nearest neighbors (KNN) to predict the associations between circRNAs and diseases. Specifically, we apply RWR algorithm on weighting features with global network topology information, and employ KNN to classify based on features. Finally, the prediction scores of each circRNA-disease pair are obtained. As demonstrated by leave-one-out, 5-fold cross-validation and 10-fold cross-validation, RWRKNN achieves AUC values of 0.9297, 0.9333 and 0.9261, respectively. And case studies show that the circRNA-disease associations predicted by RWRKNN can be successfully demonstrated. In conclusion, RWRKNN is a useful method for predicting circRNA-disease associations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computational Biology / methods
  • Genetic Predisposition to Disease / genetics*
  • Humans
  • RNA, Circular / genetics*

Substances

  • RNA, Circular