Small Molecular Drug Screening Based on Clinical Therapeutic Effect

Molecules. 2022 Jul 27;27(15):4807. doi: 10.3390/molecules27154807.

Abstract

Virtual screening can save substantial experimental time and cost in early drug discovery. Multi-class drug classification can speed up virtual screening by quickly predicting the most likely class for a drug. In this study, 1019 drug molecules with documented therapeutic effects were collected from multiple databases and the literature, and were grouped into molecular sets according to therapeutic effect and mechanism of action. Molecular descriptors and molecular fingerprints were derived from SMILES strings to quantify molecular structure. After the data set was divided with the Kennard–Stone method, a better combination was identified by comparing the combined results of five classification algorithms and a fusion method. For each specific data set, the best-performing model was then used to predict the validation set. On the test set, prediction accuracy reached as high as 0.862 and the kappa coefficient as high as 0.808. The highest classification accuracy on the validation set was 0.873. A more reliable molecular set was identified, which could be used to predict potential attributes of unknown drug compounds and even to discover new uses for old drugs. We hope this research can serve as a reference for future virtual screening of multiple drug classes simultaneously.
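
As a rough illustration of two steps named in the abstract, the sketch below shows a Kennard–Stone split and the accuracy and kappa calculation. It is not the authors' implementation: the function names, the use of Euclidean distance, and the toy data are assumptions made only for this example, and the study's actual features are molecular descriptors and fingerprints rather than the one-dimensional points used here.

```python
import math
from collections import Counter


def kennard_stone_split(points, n_train):
    """Return (train_indices, test_indices) chosen by the Kennard-Stone rule.

    Assumption: Euclidean distance between feature vectors.
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Seed the training set with the two most distant samples.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    # min_dist[k]: distance from sample k to its nearest selected sample.
    min_dist = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}
    while len(selected) < n_train:
        # Pick the sample that is farthest from everything selected so far.
        nxt = max(remaining, key=lambda k: min_dist[k])
        selected.append(nxt)
        remaining.remove(nxt)
        del min_dist[nxt]
        for k in remaining:
            min_dist[k] = min(min_dist[k], dist[k][nxt])
    return selected, remaining


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n  # observed agreement
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Agreement expected by chance, from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    kappa = 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)
    return p_o, kappa


# Toy data (hypothetical, not from the study).
toy_points = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
train_idx, test_idx = kennard_stone_split(toy_points, n_train=3)

toy_true = ["a", "a", "b", "b", "c", "c"]
toy_pred = ["a", "a", "b", "c", "c", "c"]
acc, kappa = accuracy_and_kappa(toy_true, toy_pred)
```

Kennard–Stone places the most mutually distant samples in the training set, so the training data spans the feature space and the remaining samples fall inside it. Kappa corrects accuracy for the agreement expected by chance, which is why the reported kappa (0.808) is lower than the reported accuracy (0.862).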

Keywords: Dempster–Shafer theory; Kennard–Stone division; molecular descriptor; molecular fingerprint.

MeSH terms

  • Algorithms*
  • Databases, Factual
  • Drug Discovery* / methods
  • Drug Evaluation, Preclinical / methods
  • Molecular Structure