Prediction of aquatic toxicity mode of action using linear discriminant and random forest models

J Chem Inf Model. 2013 Sep 23;53(9):2229-39. doi: 10.1021/ci400267h. Epub 2013 Aug 20.

Abstract

The ability to determine the mode of action (MOA) for a diverse group of chemicals is a critical part of ecological risk assessment and chemical regulation. However, existing MOA assignment approaches in ecotoxicology have been limited to a relatively few MOAs, have high uncertainty, or rely on professional judgment. In this study, machine based learning algorithms (linear discriminant analysis and random forest) were used to develop models for assigning aquatic toxicity MOA. These methods were selected since they have been shown to be able to correlate diverse data sets and provide an indication of the most important descriptors. A data set of MOA assignments for 924 chemicals was developed using a combination of high confidence assignments, international consensus classifications, ASTER (ASessment Tools for the Evaluation of Risk) predictions, and weight of evidence professional judgment based an assessment of structure and literature information. The overall data set was randomly divided into a training set (75%) and a validation set (25%) and then used to develop linear discriminant analysis (LDA) and random forest (RF) MOA assignment models. The LDA and RF models had high internal concordance and specificity and were able to produce overall prediction accuracies ranging from 84.5 to 87.7% for the validation set. These results demonstrate that computational chemistry approaches can be used to determine the acute toxicity MOAs across a large range of structures and mechanisms.

MeSH terms

  • Aquatic Organisms / drug effects*
  • Computational Biology / methods*
  • Discriminant Analysis
  • Quantitative Structure-Activity Relationship
  • Reproducibility of Results
  • Toxicity Tests*