Predicting mutagenicity of aromatic amines by various machine learning approaches

Toxicol Sci. 2010 Aug;116(2):498-513. doi: 10.1093/toxsci/kfq159. Epub 2010 May 27.

Abstract

Aromatic amines are widely used across many industries and are ubiquitous in foods and the environment. Many compounds in this class are potentially mutagenic or even carcinogenic, so assessing and predicting their mutagenicity is of practical importance: mutagenicity and carcinogenicity are toxicological end points that play major roles in the genesis of cancer and tumors. Quantitative structure-activity relationship (QSAR) models were developed for a homogeneous set of mutagenicity data (TA98 + S9), comprehensively compiled from the literature, using four machine learning methods: hierarchical support vector regression (HSVR), support vector machine, radial basis function neural networks, and genetic function algorithm. The predictions of these models are in good agreement with the experimental observations for the molecules in the training set (n = 97, r² = 0.78-0.93, q² = 0.64-0.93, root mean square error [RMSE] = 0.51-0.90, SD = 0.34-0.56) and the test set (n = 25, r² = 0.73-0.85, RMSE = 0.65-0.85, SD = 0.33-0.51). In addition, several validation criteria were applied to verify the generated models, and a set of outliers was deliberately selected to examine the robustness of the four predictive models (n = 14, r² = 0.35-0.84, RMSE = 0.55-1.21, SD = 0.25-0.72). Finally, various cross-comparison schemes (forward comparisons, backward comparisons, and most common molecule comparisons) were carried out against assorted published predictive models. Our results indicate that the HSVR model is the most accurate, robust, and consistent and can be employed as a tool for predicting the mutagenicity of aromatic amines.
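
For readers unfamiliar with the statistics quoted above, the following is a minimal sketch of how r², RMSE, and a leave-one-out q² are commonly computed. It is not the authors' code. It assumes r² is the coefficient of determination (1 - SS_res/SS_tot) and q² is the same quantity evaluated on leave-one-out predictions; the abstract does not state which definitions were used. A one-descriptor least-squares line stands in for the actual models (HSVR, SVM, RBF networks, GFA), and the function names and toy numbers are hypothetical, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted activities."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept.

    A stand-in model for illustration only; the paper's models are far richer.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q_squared_loo(x, y):
    """Leave-one-out q²: refit without each sample, then predict that sample."""
    loo_predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        loo_predictions.append(slope * x[i] + intercept)
    return r_squared(y, loo_predictions)


# Hypothetical toy data: one descriptor value and one activity per molecule.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [0.1, 0.9, 2.1, 2.9, 4.1]

slope, intercept = fit_line(descriptor, activity)
fitted = [slope * d + intercept for d in descriptor]

print("r2   =", r_squared(activity, fitted))
print("RMSE =", rmse(activity, fitted))
print("q2   =", q_squared_loo(descriptor, activity))
```

With a least-squares model like this stand-in, q² cannot exceed r², because each left-out molecule is predicted by a model that never saw it. A large gap between the two is usually read as a sign of overfitting.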

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amines / toxicity*
  • Artificial Intelligence*
  • Models, Statistical
  • Mutagenicity Tests
  • Mutagens / toxicity*
  • Neural Networks, Computer
  • Quantitative Structure-Activity Relationship

Substances

  • Amines
  • Mutagens