In silico categorization of in vivo intrinsic clearance using machine learning

Mol Pharm. 2013 Apr 1;10(4):1318-21. doi: 10.1021/mp300484r. Epub 2013 Mar 4.

Abstract

Machine learning is now widely used in life science research, for example to find quantitative structure-activity relationships (QSARs) between molecular structures and biological end points. In this work, we applied orthogonal partial least-squares (OPLS), principal component analysis (PCA), and random forests (RF) methods, for both classification and regression analysis, to a publicly available in vivo data set in order to assess intrinsic metabolic clearance (CL(int)) in humans. The derived classification models distinguish compounds with CL(int) below 1500 mL/min from those above it with nearly 80% accuracy. The most relevant descriptors are of the lipophilicity and charge/polarizability types. Furthermore, a classification model based on regression analysis, using the same 1500 mL/min cutoff, also reaches an accuracy of around 80%. These results suggest that machine learning techniques are useful for deriving robust and predictive models for in vivo ADMET (absorption, distribution, metabolism, elimination, and toxicity) modeling.
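
The regression-based classification described in the abstract can be illustrated with a minimal sketch: continuous CL(int) predictions are converted to "low" or "high" classes at the 1500 mL/min cutoff, and the accuracy is the fraction of compounds whose predicted class matches the observed one. This is not the authors' implementation. The function names, the handling of values exactly at the cutoff (assigned to "low" here), and all numerical values are assumptions made up for illustration; they are not data from the study.

```python
# Illustrative sketch only: every CL(int) value below is made up,
# not taken from the study's data set.
CUTOFF_ML_MIN = 1500.0  # CL(int) cutoff named in the abstract


def to_class(cl_int: float, cutoff: float = CUTOFF_ML_MIN) -> str:
    """Label a CL(int) value as 'high' (above the cutoff) or 'low'.

    Assumption: a value exactly equal to the cutoff counts as 'low'.
    """
    return "high" if cl_int > cutoff else "low"


def classification_accuracy(observed, predicted, cutoff=CUTOFF_ML_MIN):
    """Fraction of compounds whose predicted class matches the observed class."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    hits = sum(
        to_class(obs, cutoff) == to_class(pred, cutoff)
        for obs, pred in zip(observed, predicted)
    )
    return hits / len(observed)


# Hypothetical observed and regression-predicted CL(int) values in mL/min.
observed_cl = [200.0, 900.0, 1600.0, 5000.0, 1400.0]
predicted_cl = [350.0, 1700.0, 2100.0, 3200.0, 1000.0]

accuracy = classification_accuracy(observed_cl, predicted_cl)
print(f"accuracy = {accuracy:.2f}")
```

In this toy example, one of the five compounds (observed 900 mL/min, predicted 1700 mL/min) falls on the wrong side of the cutoff. A regression model can therefore have sizeable numerical errors and still classify correctly, as long as the prediction stays on the same side of the cutoff as the observed value.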

MeSH terms

  • Absorption
  • Artificial Intelligence*
  • Computer Simulation
  • Drug Design
  • Humans
  • Least-Squares Analysis
  • Models, Statistical
  • Principal Component Analysis
  • Quantitative Structure-Activity Relationship*
  • Regression Analysis
  • Reproducibility of Results
  • Software