A Semi-automated Approach to Create Purposeful Mechanistic Datasets from Heterogeneous Data: Data Mining Towards the in silico Predictions for Oestrogen Receptor Modulation and Teratogenicity

Mol Inform. 2017 Aug;36(8). doi: 10.1002/minf.201600154. Epub 2017 Apr 24.

Abstract

The need for alternatives to costly animal studies in developmental and reproductive toxicity testing has shifted the focus considerably towards the assessment of in vitro developmental toxicology models and the exploitation of pharmacological data for relevant molecular initiating events. Here we demonstrate how automation can be applied successfully to handle heterogeneous oestrogen receptor data from ChEMBL. Applying expert-derived thresholds to specific bioactivities allowed an activity call to be attributed to each data entry. Human intervention further improved this mechanistic dataset, which was then mined to develop structure-activity relationship alerts and an expert model covering 45 chemical classes for the prediction of oestrogen receptor modulation. Evaluation of the model against FDA EDKB and Tox21 data was encouraging. The model can also provide a teratogenicity prediction, together with additional information relevant to the query compound; experts will still need to assess all of this output carefully when judging potential risk.
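
The threshold-based activity call described in the abstract might be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not state the thresholds, so the cut-off values, the `Bioactivity` record, the `activity_call` function, and the four call labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds in nM, NOT taken from the paper:
# activity type -> (active at or below, inactive at or above).
THRESHOLDS_NM = {
    "IC50": (1_000.0, 10_000.0),
    "Ki": (1_000.0, 10_000.0),
    "EC50": (1_000.0, 10_000.0),
}


@dataclass
class Bioactivity:
    """One heterogeneous data entry (hypothetical, simplified record)."""
    compound_id: str
    activity_type: str          # e.g. "IC50", "Ki", "EC50"
    value_nm: Optional[float]   # potency in nM, None if not reported


def activity_call(record: Bioactivity) -> str:
    """Attribute an activity call to a single entry using the thresholds."""
    bounds = THRESHOLDS_NM.get(record.activity_type)
    if bounds is None or record.value_nm is None:
        # No threshold defined for this measurement type, or no value given:
        # leave the entry for human review rather than guessing.
        return "unclassified"
    active_max, inactive_min = bounds
    if record.value_nm <= active_max:
        return "active"
    if record.value_nm >= inactive_min:
        return "inactive"
    return "equivocal"


entries = [
    Bioactivity("CMPD-1", "IC50", 50.0),
    Bioactivity("CMPD-2", "Ki", 5_000.0),
    Bioactivity("CMPD-3", "EC50", 20_000.0),
    Bioactivity("CMPD-4", "Inhibition", 35.0),
]
calls = {e.compound_id: activity_call(e) for e in entries}
```

Entries that fall between the two cut-offs, or whose measurement type has no threshold, are flagged rather than forced into a class. That mirrors the semi-automated design, in which human intervention refines the automatically generated calls.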

Keywords: Derek Nexus; bioactivities; data-mining; expert activity calls; expert rules; molecular initiating event; oestrogen receptor; semi-automation; structural-activity relationships.

MeSH terms

  • Cluster Analysis
  • Computer Simulation
  • Data Mining* / methods
  • Estrogen Receptor Modulators / chemistry*
  • Estrogen Receptor Modulators / pharmacology*
  • Models, Biological*
  • Models, Molecular*
  • Molecular Structure
  • Receptors, Estrogen / chemistry*
  • Receptors, Estrogen / metabolism*
  • Structure-Activity Relationship
  • Teratogenesis*
  • Workflow

Substances

  • Estrogen Receptor Modulators
  • Receptors, Estrogen