Most machine learning studies of the relationship between miRNAs and diseases improve prediction results by building different models, and pay less attention to the feature information contained in the miRNA sequence itself. This study focused on how different sequence-derived features of miRNAs affect the prediction of miRNA-disease relationships. With the graph neural network held fixed, miRNA features based on the composition of k-spaced nucleic acid pairs (CKSNAP) yielded a better graph neural network model for predicting miRNA-disease relationships (AUC = 93.71%), which was 0.15% higher than the best model reported in the literature on the same benchmark dataset. The optimized model was also used to predict miRNAs related to lung tumors, esophageal tumors, and kidney tumors. Of the top 50 miRNAs predicted for each of the three diseases, 47, 47, and 37, respectively, were consistent with records in dbDEMC, a database of wet-lab experimental validation results.
Keywords: graph auto-encoder; graph neural network; miRNA; miRNA sequence similarity.
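For readers unfamiliar with CKSNAP, the following is a minimal sketch of how such a feature vector can be computed from a single miRNA sequence. It is illustrative only and is not taken from the study's code: the function name `cksnap`, the `max_gap` parameter and its default, the handling of non-ACGU characters, and the per-gap normalization are all assumptions.

```python
from itertools import product

ALPHABET = "ACGU"
# The 16 ordered nucleotide pairs: AA, AC, AG, AU, CA, ...
PAIRS = ["".join(p) for p in product(ALPHABET, repeat=2)]


def cksnap(seq, max_gap=2):
    """Return k-spaced nucleic acid pair frequencies for gaps 0..max_gap.

    Hypothetical helper: max_gap and the normalization are assumptions,
    not settings reported in the abstract.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        # Pair each position i with the position gap + 1 residues downstream.
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:  # skip pairs containing ambiguous symbols
                counts[pair] += 1
                total += 1
        # Normalize counts to frequencies within this gap size.
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features
```

Under these assumptions each sequence maps to a vector of length 16 × (`max_gap` + 1), which could then serve as the miRNA node features of a graph neural network.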