In Silico Prediction of Endocrine Disrupting Chemicals Using Single-Label and Multilabel Models

J Chem Inf Model. 2019 Mar 25;59(3):973-982. doi: 10.1021/acs.jcim.8b00551. Epub 2019 Mar 11.

Abstract

Endocrine disruption (ED) has become a serious public health issue and also poses a significant threat to ecosystems. Because ED mechanisms are complex, traditional in silico models that focus on a single mechanism are insufficient for detecting endocrine disrupting chemicals (EDCs), let alone for giving an overview of the possible mechanisms of action of a known EDC. To overcome these limitations, this study constructed both single-label and multilabel models across six ED targets: AR (androgen receptor), ER (estrogen receptor alpha), TR (thyroid receptor), GR (glucocorticoid receptor), PPARg (peroxisome proliferator-activated receptor gamma), and aromatase. Two machine learning methods were used to build the single-label models, with multiple random under-sampling combined with voting classification to address data imbalance. Four methods were explored to construct multilabel models that predict the interaction of one EDC with multiple targets simultaneously. The single-label models for all six targets achieved reasonable performance, with balanced accuracy (BA) values from 0.742 to 0.816. The top single-label model for each target was then combined with the others to predict the multilabel test set, giving BA values from 0.586 to 0.711. The multilabel models offered a significant boost over these single-label baselines, with BA values on the multilabel test set from 0.659 to 0.832. We therefore conclude that single-label models can be used to identify potential EDCs, while multilabel models are preferable for predicting the possible mechanisms of known EDCs.
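
As a rough illustration of two ideas named in the abstract, the Python sketch below shows balanced accuracy for a binary target and one way to pair multiple random under-sampling with majority voting. It is not the authors' implementation: the abstract does not name the two learning methods, the descriptors, or the number of under-sampling rounds, so the one-dimensional nearest-centroid base learner, the function names, and the `n_rounds` and `seed` parameters are all assumptions made for illustration.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def undersample(X, y, rng):
    """Keep every minority sample plus an equal-sized random draw of the majority."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = minority + rng.sample(majority, len(minority))
    return [X[i] for i in keep], [y[i] for i in keep]


def fit_centroids(X, y):
    """Toy base learner (assumption): one centroid per class on a 1-D descriptor."""
    return {c: mean(x for x, label in zip(X, y) if label == c) for c in (0, 1)}


def predict_centroids(model, x):
    """Assign the class whose centroid is closest to x."""
    return min(model, key=lambda c: abs(x - model[c]))


def fit_ensemble(X, y, n_rounds=11, seed=0):
    """Train one base learner per balanced subsample (n_rounds is hypothetical)."""
    rng = random.Random(seed)
    return [fit_centroids(*undersample(X, y, rng)) for _ in range(n_rounds)]


def predict_ensemble(models, x):
    """Majority vote over the base learners; an odd n_rounds avoids ties."""
    votes = sum(predict_centroids(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


# Imbalanced toy data: 10 inactive compounds, 3 active ones.
X = [float(v) for v in range(10)] + [20.0, 21.0, 22.0]
y = [0] * 10 + [1] * 3
models = fit_ensemble(X, y)
predictions = [predict_ensemble(models, x) for x in X]
toy_ba = balanced_accuracy(y, predictions)
```

Each round sees all minority-class compounds but a different random slice of the majority class, so the vote draws on more of the majority data than a single balanced subsample would. For a multilabel test set, the same `balanced_accuracy` function could be applied to each target's label column separately.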

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation*
  • Drug Evaluation, Preclinical
  • Endocrine Disruptors / pharmacology*

Substances

  • Endocrine Disruptors