Prediction Accuracy of Production ADMET Models as a Function of Version: Activity Cliffs Rule

Robert P Sheridan; J Chris Culberson; Elizabeth Joshi; Matthew Tudor; Prabha Karnachi

doi:10.1021/acs.jcim.2c00699

J Chem Inf Model. 2022 Jul 25;62(14):3275-3280. doi: 10.1021/acs.jcim.2c00699. Epub 2022 Jul 7.

Authors

Robert P Sheridan¹, J Chris Culberson¹, Elizabeth Joshi¹, Matthew Tudor¹, Prabha Karnachi¹

Affiliation

¹ Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States.

PMID: 35796226

Abstract

Like many other institutions, our company maintains many quantitative structure-activity relationship (QSAR) models of absorption, distribution, metabolism, excretion, and toxicity (ADMET) end points and updates the models regularly. We recently examined version-to-version predictivity for these models over a period of 10 years. In this approach, we monitor how well version V of a model predicts new molecules, that is, molecules absent from the training set of version V, before they are incorporated into the updated model V+1. Using a cell-based permeability assay (Papp) as an example, we illustrate how the QSAR models built from these data are generally predictive and can be used to enrich chemical designs and synthesis. Despite the obvious utility of these models, we found unexpected behavior in Papp and other ADMET activities for which the explanation is not obvious. One such behavior is that the apparent predictivity of the models, as measured by root-mean-square error (RMSE), can vary greatly from version to version and is sometimes very poor. One intuitively appealing explanation is that the observed activities of the new molecules fall outside the bulk of activities in the training set. Alternatively, one may think that the new molecules are exploring different regions of chemical space than the training set. However, the real explanation has to do with activity cliffs: if the observed activities of the new molecules differ from what would be expected based on similar molecules in the training set, the predictions will be less accurate. This is true for all our ADMET end points.
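The two quantities the abstract relates can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the set-based feature representation, the Tanimoto similarity, the single-nearest-neighbor rule, and all numbers are assumptions chosen for illustration, since the abstract does not specify how similarity or activity cliffs were measured. The sketch computes the prospective RMSE of hypothetical version V predictions for new molecules, and the gap between each new molecule's observed activity and that of its most similar training-set molecule.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted activities."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def tanimoto(a, b):
    """Tanimoto similarity between two sets of feature identifiers."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbor_gap(new_features, new_activity, training):
    """Absolute activity difference to the most similar training molecule.

    `training` is a list of (feature_set, observed_activity) pairs.
    A large gap means the new molecule behaves unlike its closest
    training-set neighbor, i.e. it sits on an activity cliff.
    """
    _, neighbor_activity = max(
        training, key=lambda pair: tanimoto(new_features, pair[0])
    )
    return abs(new_activity - neighbor_activity)


# Hypothetical toy data: (feature set, observed activity) for version V.
training = [({1, 2, 3, 4}, 1.0), ({5, 6, 7, 8}, 2.0)]

# Hypothetical new molecules: (feature set, observed, predicted by version V).
new_molecules = [
    ({1, 2, 3, 9}, 1.1, 1.0),  # behaves like its neighbor
    ({5, 6, 7, 9}, 0.5, 1.9),  # similar structure, very different activity
]

prospective_rmse = rmse(
    [obs for _, obs, _ in new_molecules],
    [pred for _, _, pred in new_molecules],
)
gaps = [nearest_neighbor_gap(f, obs, training) for f, obs, _ in new_molecules]

print(f"prospective RMSE: {prospective_rmse:.3f}")
print("neighbor gaps:", [round(g, 2) for g in gaps])
```

In this toy case the second new molecule is structurally close to a training molecule but has a much lower observed activity, so it contributes both the large neighbor gap and most of the prediction error, which is the relationship the abstract describes.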

Publication types

Review

MeSH terms

Quantitative Structure-Activity Relationship*