Experimental design strategy: weak reinforcement leads to increased hit rates and enhanced chemical diversity

J Chem Inf Model. 2015 May 26;55(5):956-62. doi: 10.1021/acs.jcim.5b00054. Epub 2015 May 14.

Abstract

High-throughput screening (HTS) is a common approach in the life sciences to discover chemical matter that modulates a biological target or phenotype. However, low assay throughput, reagent cost, or a flowchart that can deal with only a limited number of hits may make it impractical to screen large numbers of compounds. In this case, a subset of compounds is assayed, and in silico models are used to guide iterative screening design, usually to expand around the hits found and enrich subsequent rounds for relevant chemical matter. However, this may lead to an overly narrow focus, and the diversity of compounds sampled in subsequent iterations may suffer. Active learning has recently been applied successfully in drug discovery with the goal of sampling diverse chemical space to improve model performance. Here we introduce a robust and straightforward iterative screening protocol based on naïve Bayes models. Instead of following up on the compounds with the highest scores in the in silico model, we pursue compounds with very low but positive values. This includes unique chemotypes of weakly active compounds that enhance the applicability domain of the model and increase the cumulative hit rates. We show in a retrospective application to 81 Novartis assays that this protocol leads to consistently higher compound and scaffold hit rates compared to a standard expansion around hits or an active learning approach. We recommend using the weak reinforcement strategy introduced herein for iterative screening workflows.
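The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes each untested compound already has a naïve Bayes score in which a positive value means "predicted active"; the compound identifiers, the example scores, the zero cutoff, and both function names are hypothetical and are not taken from the paper. The sketch contrasts a standard expansion around hits (highest scores first) with the weak reinforcement idea (lowest positive scores first).

```python
def select_top(scores, n):
    """Standard expansion: return the n highest-scoring compounds."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [cid for cid, _ in ranked[:n]]


def select_weak_reinforcement(scores, n):
    """Weak reinforcement (as sketched here): return the n compounds
    whose scores are positive but lowest. The cutoff of 0 is an assumption."""
    positive = [(cid, s) for cid, s in scores.items() if s > 0]
    positive.sort(key=lambda kv: kv[1])  # ascending: weakest positives first
    return [cid for cid, _ in positive[:n]]


# Hypothetical model scores for six untested compounds.
scores = {
    "cpd_A": 12.4,
    "cpd_B": 0.3,
    "cpd_C": -2.1,
    "cpd_D": 0.9,
    "cpd_E": 7.8,
    "cpd_F": -0.4,
}

print(select_top(scores, 2))                 # ['cpd_A', 'cpd_E']
print(select_weak_reinforcement(scores, 2))  # ['cpd_B', 'cpd_D']
```

In an iterative workflow, the selected compounds would be assayed, the model retrained on the enlarged data set, and the selection repeated; those steps are outside the scope of this sketch.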

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computer Simulation
  • Drug Evaluation, Preclinical / methods*
  • Machine Learning*