Using Sequence Similarity Based on CKSNP Features and a Graph Neural Network Model to Identify miRNA-Disease Associations

Genes (Basel). 2022 Sep 28;13(10):1759. doi: 10.3390/genes13101759.

Abstract

Most machine learning studies of the relationship between miRNAs and diseases try to improve prediction results by building different models, and pay less attention to the feature information contained in the miRNA sequence itself. This study focused on how different feature information extracted from miRNA sequences affects the prediction of miRNA-disease relationships. With the graph neural network held fixed, miRNA features based on the K-spacer nucleic acid pair composition (CKSNAP) yielded a better graph neural network prediction model of the miRNA-disease relationship (AUC = 93.71%), which was 0.15% higher than the best model in the literature on the same benchmark dataset. The optimized model was also used to predict miRNAs related to lung tumors, esophageal tumors, and kidney tumors; of the top 50 miRNAs predicted for each disease, 47, 47, and 37, respectively, were consistent with records in the wet-lab experiment validation database dbDEMC.
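To make the CKSNAP idea concrete, the following is a minimal sketch of how such features are commonly computed: for each gap size from 0 up to a maximum, count every ordered nucleotide pair separated by that many positions and normalize by the number of pairs at that gap. The function names `cksnap` and `cosine_similarity`, the default `k_max=2`, the toy sequences, and the use of cosine similarity to compare two feature vectors are illustrative assumptions; the abstract does not state the parameters or the similarity measure used in the study.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"
# The 16 ordered nucleotide pairs: AA, AC, AG, AU, CA, ...
PAIRS = ["".join(p) for p in product(ALPHABET, repeat=2)]


def cksnap(seq, k_max=2):
    """Return 16 * (k_max + 1) pair frequencies for an RNA sequence.

    k_max is an assumed setting, not a value taken from the paper.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for gap in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        # Pair the residue at i with the residue gap + 1 positions later.
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:  # skip pairs containing ambiguous symbols
                counts[pair] += 1
                total += 1
        # Normalize by the number of valid pairs seen at this gap.
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy sequences for illustration only, not real miRNAs.
    a = cksnap("ACGUACGUAGGU")
    b = cksnap("ACGUUCGUAGCU")
    print(len(a), round(cosine_similarity(a, b), 3))
```

A pairwise similarity matrix built this way could serve as the miRNA-side input to a graph model, although how the study combines it with disease information is not described in the abstract.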

Keywords: graph auto-encoder; graph neural network; miRNA; miRNA sequence similarity.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods
  • Esophageal Neoplasms*
  • Humans
  • Machine Learning
  • MicroRNAs* / genetics
  • Neural Networks, Computer

Substances

  • MicroRNAs

Grants and funding

This research was funded by the National Natural Science Foundation of China (no. 62001090) and the Fundamental Research Funds for the Central Universities of Sichuan University (no. YJ2021104).