In silico prediction of chemical reproductive toxicity using machine learning

J Appl Toxicol. 2019 Jun;39(6):844-854. doi: 10.1002/jat.3772. Epub 2019 Jan 27.

Abstract

Reproductive toxicity is an important regulatory endpoint in health hazard assessment. Because in vivo tests are expensive, time-consuming and require large numbers of animals that must be killed, in silico approaches have been developed as alternative strategies for assessing the potential reproductive toxicity of chemicals. Several prediction models for reproductive toxicity exist, but most were built on a single endpoint, such as embryo teratogenicity; these models therefore may not give reliable predictions for chemicals that are toxic through other endpoints, such as sperm reduction or gonadal dysgenesis. Here, a total of 1823 chemicals whose reproductive toxicity was characterized by multiple endpoints were used to develop structure-activity relationship models with six machine-learning approaches and nine molecular fingerprints. Among these models, the MACCSFP-SVM model performed best on the external validation set (area under the curve = 0.900, classification accuracy = 0.836). The applicability domain was analyzed, and a rational boundary was found that distinguishes inaccurate predictions from accurate ones. Moreover, several structural alerts for characterizing reproductive toxicity were identified by combining information gain with substructure frequency analysis. These results should be helpful for predicting the reproductive toxicity of chemicals.
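
The information-gain criterion mentioned for structural alerts can be illustrated with a minimal sketch. It assumes each chemical is reduced to two binary flags (whether it contains a given substructure, and whether it is labelled toxic) and uses the standard entropy-based definition of information gain. The function names, the `toxic_fraction` helper and the toy data are hypothetical and are not taken from the paper, whose exact frequency measure is not given in the abstract.

```python
from math import log2


def entropy(pos, neg):
    """Shannon entropy (in bits) of a binary class split given two counts."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def information_gain(has_substructure, is_toxic):
    """Entropy of the toxicity labels minus the weighted entropy after
    splitting the chemicals on presence/absence of one substructure."""
    pairs = list(zip(has_substructure, is_toxic))
    n = len(pairs)
    if n == 0:
        return 0.0
    with_pos = sum(1 for s, t in pairs if s and t)
    with_neg = sum(1 for s, t in pairs if s and not t)
    without_pos = sum(1 for s, t in pairs if not s and t)
    without_neg = sum(1 for s, t in pairs if not s and not t)
    h_before = entropy(with_pos + without_pos, with_neg + without_neg)
    h_after = (
        (with_pos + with_neg) / n * entropy(with_pos, with_neg)
        + (without_pos + without_neg) / n * entropy(without_pos, without_neg)
    )
    return h_before - h_after


def toxic_fraction(has_substructure, is_toxic):
    """Hypothetical frequency measure: share of toxic chemicals among those
    containing the substructure (an assumption, not the paper's formula)."""
    hits = [t for s, t in zip(has_substructure, is_toxic) if s]
    return sum(1 for t in hits if t) / len(hits) if hits else 0.0


# Toy data invented for illustration: four chemicals, two toxic.
labels = [1, 1, 0, 0]
alert_like = [1, 1, 0, 0]      # present only in the toxic chemicals
uninformative = [1, 0, 1, 0]   # evenly spread across both classes

ig_alert = information_gain(alert_like, labels)
ig_uninformative = information_gain(uninformative, labels)
```

In this sketch a substructure becomes a candidate alert only when it both separates the classes well (high information gain) and occurs mostly in toxic chemicals (high toxic fraction); information gain alone would rank a substructure found only in non-toxic chemicals equally high.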

Keywords: machine learning; molecular fingerprint; reproductive toxicity; structural alerts; structure-activity relationship.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computer Simulation
  • Datasets as Topic
  • Machine Learning*
  • Reproduction / drug effects*
  • Structure-Activity Relationship