Random forest prediction of mutagenicity from empirical physicochemical descriptors

J Chem Inf Model. 2007 Jan-Feb;47(1):1-8. doi: 10.1021/ci050520j.

Abstract

Fast-to-calculate empirical physicochemical descriptors were investigated for their ability to predict mutagenicity (positive or negative Ames test) from the molecular structure. Fast methods are highly desirable for screening large libraries of compounds. Global molecular descriptors and MOLMAP descriptors of bond properties were used to train random forests. Error percentages as low as 15% and 16% were achieved for an external test set of 472 compounds and for the training set of 4083 structures, respectively. High sensitivity and specificity were observed. Random forests were able to associate meaningful probabilities with the predictions and to explain the predictions in terms of similarities between query structures and compounds in the training set.
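
The quantities reported in the abstract (error percentage, sensitivity, specificity, and a per-prediction probability) can be illustrated with a minimal sketch. It is not the authors' code. The function names `ames_metrics` and `vote_probability` are hypothetical, the labels are invented toy data, and the vote-fraction probability is one common way a random forest assigns a probability, assumed here rather than taken from the paper.

```python
import math
from typing import Dict, Sequence


def ames_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Error rate, sensitivity and specificity for binary labels.

    Label convention (an assumption): 1 = Ames positive (mutagenic),
    0 = Ames negative.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    return {
        # Fraction of all compounds that were misclassified.
        "error_rate": (fp + fn) / len(y_true),
        # Fraction of true mutagens predicted as mutagens.
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # Fraction of true non-mutagens predicted as non-mutagens.
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


def vote_probability(tree_votes: Sequence[int]) -> float:
    """Fraction of trees voting 'mutagenic' (1) for a single query compound."""
    if len(tree_votes) == 0:
        raise ValueError("tree_votes must be non-empty")
    return sum(tree_votes) / len(tree_votes)


# Toy data, invented for illustration only (not from the paper).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = ames_metrics(toy_true, toy_pred)
toy_probability = vote_probability([1, 1, 0, 1])
```

With real data, `y_pred` would come from the trained forest and `tree_votes` from the individual trees' class votes for one query structure.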

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Artificial Intelligence*
  • Carcinogens / chemistry*
  • Computer Simulation
  • Drug Design*
  • Humans
  • Models, Statistical*
  • Molecular Structure
  • Mutagens / chemistry*
  • Probability
  • Quantitative Structure-Activity Relationship*

Substances

  • Carcinogens
  • Mutagens