Novel Potential Small Molecule-MiRNA-Cancer Associations Prediction Model Based on Fingerprint, Sequence, and Clinical Symptoms

J Chem Inf Model. 2021 May 24;61(5):2208-2219. doi: 10.1021/acs.jcim.0c01458. Epub 2021 Apr 26.

Abstract

As important biomarkers in organisms, miRNAs are closely related to various small molecules and diseases. Research on small molecule-miRNA-cancer associations supports the development of cancer treatment drugs and the discovery of pathogenesis. Because experimental approaches are usually time-consuming, laborious, and expensive, computational methods for identifying potential small molecule-miRNA-cancer associations are urgently needed. To address this problem, we developed a new computational method in which features derived from structure, sequence, and symptoms were used to characterize small molecules, miRNAs, and cancers, respectively. A feature vector characterizing each small molecule-miRNA-cancer association was constructed by concatenating these features, and a random forest algorithm was used to build a model for recognizing potential associations. In 5-fold cross-validation on the benchmark data set, the model achieved an accuracy of 93.20 ± 0.52%, a precision of 93.22 ± 0.51%, a recall of 93.20 ± 0.53%, and an F1-measure of 93.20 ± 0.52%. The areas under the receiver operating characteristic curve and the precision-recall curve were 0.9873 and 0.9870, respectively. The predictive ability and practical performance of the method were further evaluated and verified through an independent data set test and a case study. Some potential small molecules and miRNAs related to cancer were identified and are worthy of further experimental research. We anticipate that the model could serve as a useful high-throughput virtual screening tool for drug research and development. All source code can be downloaded from https://github.com/LeeKamlong/Multi-class-SMMCA.
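The workflow described in the abstract (concatenating per-entity features into one vector, splitting the data into five folds, and scoring predictions) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy feature values, the contiguous fold split, and the support-weighted averaging of precision, recall, and F1 are all assumptions made for illustration, and the random forest classifier itself is omitted. The abstract does not state which averaging scheme was used.

```python
from collections import Counter


def build_feature_vector(fingerprint, mirna_features, symptom_features):
    """Concatenate the three per-entity feature lists into one sample vector.

    The argument names are hypothetical; the abstract only says that
    structure, sequence, and symptom features are concatenated.
    """
    return list(fingerprint) + list(mirna_features) + list(symptom_features)


def kfold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size.

    Assumption: a plain unshuffled split, used here only to show the idea
    of k-fold cross-validation.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # class weight = share of true samples
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy usage with made-up values (not data from the paper).
sample = build_feature_vector([1, 0, 1], [0.25, 0.5], [1, 1])
folds = kfold_indices(7, k=5)
scores = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
```

Under support-weighted averaging, recall is always equal to accuracy, which would be consistent with the matching 93.20% figures reported above, although this does not confirm the scheme the authors used.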

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology
  • Humans
  • MicroRNAs* / genetics
  • Neoplasms* / drug therapy
  • Neoplasms* / genetics
  • ROC Curve

Substances

  • MicroRNAs