The Derivation of a Matched Molecular Pairs Based ADME/Tox Knowledge Base for Compound Optimization

J Chem Inf Model. 2020 Oct 26;60(10):4757-4771. doi: 10.1021/acs.jcim.0c00583. Epub 2020 Oct 6.

Abstract

Matched Molecular Pairs (MMP) analysis is a well-established technique for structure-activity and structure-property relationship (SAR and SPR) analysis. Summarizing multiple MMPs that describe the same structural change into a single chemical transform (termed a Transform from here on) can be a powerful tool for prediction. This is particularly useful in Absorption, Distribution, Metabolism, and Elimination (ADME) analysis, which is less influenced by 3D structural binding effects. The creation of a knowledge database containing many of these Transforms across typical ADME assays promises to be a powerful approach to aid multidimensional optimization. We present a detailed workflow for the derivation of such a database. We include details of an MMP fragmentation algorithm with associated statistical summarization methods for the derivation of Transforms. These are made freely available as part of the LillyMol software package. We describe the application of this method to several ADME/Tox (Toxicity) assay data sets and highlight multiple cases where the impact of traditional medicinal chemistry Transforms is contradicted by MMP data. We also describe the internal software interface used by medicinal chemists to aid the design of new compounds via automated suggestion. This approach uses the matched pairs database to "suggest" improved compounds in an automated design scenario. A nonvisual script-based version of the automated suggestions code, with an associated set of described chemical Transforms, is also made freely available along with this paper and as part of the LillyMol software package. Finally, we contrast this knowledge database against a larger database of all MMPs derived from a 2-million-compound diversity set and a subset of MMPs seen in historical discovery projects. The comparison against all transforms in the diversity collection highlights the very low coverage of the transform database compared with all possible transforms involving 15 atom fragments. The comparison against a smaller subset of Transforms seen on internal medicinal chemistry projects shows better coverage of the transform database for a small set of common medicinal chemistry strategies. Within the context of all possible transforms available to a medicinal chemistry project team, the challenge remains to move beyond mere idea generation from past projects toward high-quality prediction for novel ADME/Tox-modulating Transforms.
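
As an illustration of the summarization step described in the abstract, the following is a minimal Python sketch of grouping matched pairs that share the same structural change and reducing them to per-Transform statistics. It is not the LillyMol implementation: the function name `summarize_transforms`, the `min_pairs` threshold, the fragment notation, and all property values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical matched pairs: (fragment_from, fragment_to, value_from, value_to).
# Fragment strings and property values are invented for illustration only;
# they are not data from the paper.
pairs = [
    ("[*]H", "[*]F", 1.0, 1.5),
    ("[*]H", "[*]F", 2.0, 2.3),
    ("[*]H", "[*]F", 0.5, 1.2),
    ("[*]Cl", "[*]OC", 3.0, 2.0),
]


def summarize_transforms(pairs, min_pairs=2):
    """Group pairs by structural change and summarize the property deltas.

    min_pairs is an assumed cutoff: changes supported by fewer pairs
    are dropped rather than summarized.
    """
    deltas = defaultdict(list)
    for frag_from, frag_to, value_from, value_to in pairs:
        deltas[(frag_from, frag_to)].append(value_to - value_from)

    summary = {}
    for transform, values in deltas.items():
        if len(values) < min_pairs:
            continue
        summary[transform] = {
            "n_pairs": len(values),
            "mean_delta": mean(values),
            # Sample standard deviation needs at least two observations.
            "std_delta": stdev(values) if len(values) > 1 else 0.0,
            # Share of pairs in which the property value increased.
            "fraction_increase": sum(v > 0 for v in values) / len(values),
        }
    return summary


transforms = summarize_transforms(pairs)
for (frag_from, frag_to), stats in transforms.items():
    print(frag_from, "->", frag_to, stats)
```

In this sketch a structural change observed in only one pair is discarded, reflecting the general idea that a Transform is a statistical summary over several MMPs; the actual thresholds and statistics used in the paper may differ.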

MeSH terms

  • Algorithms*
  • Chemistry, Pharmaceutical
  • Databases, Factual
  • Knowledge Bases
  • Software*