Accelerating Drug Discovery by Early Protein Drug Target Prediction Based on a Multi-Fingerprint Similarity Search

Molecules. 2019 Jun 14;24(12):2233. doi: 10.3390/molecules24122233.

Abstract

In this continuing work, we have updated our recently proposed Multi-fingerprint Similarity Search algorithm (MuSSel) by enabling the generation of the dominant ionized species at physiological pH and by exploring a larger data domain, which includes more than half a million high-quality small molecules extracted from the latest release of ChEMBL (version 24.1 at the time of writing). Annotated with a high biological assay confidence score, these selected compounds cover up to 2822 protein drug targets. To improve data accuracy, samples marked as prodrugs or carrying equivocal biological annotations were not considered. Notably, the overall performance of MuSSel was improved by using an object-relational database management system based on PostgreSQL. To challenge the real effectiveness of MuSSel in predicting relevant therapeutic drug targets, we analyzed a pool of 36 external bioactive compounds published in the Journal of Medicinal Chemistry from October to December 2018. This study demonstrates that the use of highly curated chemical and biological experimental data on one side, and a powerful multi-fingerprint search algorithm on the other, can be of the utmost importance in addressing the fate of newly conceived small molecules, by strongly reducing attrition in the early phases of drug discovery programs.
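The general idea of a multi-fingerprint similarity search can be illustrated with a minimal sketch. This is not the MuSSel implementation: the abstract does not specify the fingerprints, thresholds, or scoring it uses, so everything below is an assumption made for illustration. The character n-gram "fingerprints" are a toy stand-in for real chemical fingerprints, and the names `FINGERPRINTS`, `predict_targets`, the threshold of 0.5, and the target labels `T1` and `T2` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngram_fingerprint(smiles, n):
    """Toy stand-in for a chemical fingerprint (assumption):
    the set of character n-grams of a SMILES string."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


# Hypothetical fingerprint definitions; a real workflow would use
# several chemically meaningful fingerprints instead.
FINGERPRINTS = {
    "ngram2": lambda s: ngram_fingerprint(s, 2),
    "ngram3": lambda s: ngram_fingerprint(s, 3),
}


def predict_targets(query, reference, threshold=0.5):
    """Rank targets by the best similarity found under any fingerprint.

    reference: list of (smiles, target) pairs with known annotations.
    Returns a list of (target, best_score), highest score first.
    """
    best = {}
    for fingerprint in FINGERPRINTS.values():
        query_fp = fingerprint(query)
        for smiles, target in reference:
            score = tanimoto(query_fp, fingerprint(smiles))
            # Keep only hits above the (assumed) similarity threshold.
            if score >= threshold and score > best.get(target, 0.0):
                best[target] = score
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical reference set: (SMILES, target label) pairs.
reference = [("CCO", "T1"), ("c1ccccc1", "T2")]
hits = predict_targets("CCO", reference)
print(hits)
```

Taking the maximum score across fingerprints is only one possible way of combining them; consensus or weighted schemes are equally plausible, and the abstract does not say which applies here.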

Keywords: data quality; molecular similarity; multi-fingerprint; protein drug target prediction.

MeSH terms

  • Algorithms
  • Drug Discovery* / methods
  • Kinetics
  • Models, Chemical*
  • Models, Molecular*
  • Molecular Structure
  • Proteins / chemistry*
  • Quantitative Structure-Activity Relationship

Substances

  • Proteins