Application of non-parametric regression to quantitative structure-activity relationships

Bioorg Med Chem. 2002 Apr;10(4):1037-41. doi: 10.1016/s0968-0896(01)00359-5.

Abstract

Several non-parametric regressors have been applied to modelling quantitative structure-activity relationship (QSAR) data. Performance was benchmarked against multilinear regression and the nonlinear method of smoothing splines. Variable selection was explored through systematic combinations of different variables and combinations of principal components. For the training set examined (539 inhibitors of the tyrosine kinase Syk), the best two-descriptor model had a 5-fold cross-validated q² of 0.43. This model was generated by a multivariate Nadaraya-Watson kernel estimator. A subsequent, independent test set of 371 similar chemical entities showed that the model had some predictive power. Other approaches did not perform as well. A modest increase in predictive ability can be achieved with three descriptors, but the resulting model is harder to visualise. We conclude that non-parametric regression offers a potentially powerful approach to identifying predictive, low-dimensional QSARs.
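
As an illustration of the kind of model the abstract describes, the following is a minimal sketch of a two-descriptor Nadaraya-Watson kernel estimator scored by a 5-fold cross-validated q². It is not the authors' implementation. The Gaussian kernel, the single shared bandwidth, the definition q² = 1 - PRESS/SS (with SS taken about the overall mean activity), and the synthetic data are all assumptions made for this example; the function names are hypothetical and the Syk inhibitor data are not reproduced here.

```python
import numpy as np


def nadaraya_watson(X_train, y_train, X_query, bandwidth):
    """Gaussian-kernel Nadaraya-Watson estimate at each query point.

    Each prediction is a weighted mean of the training activities, with
    weights that decay with distance in descriptor space.
    """
    # Pairwise scaled squared distances, shape (n_query, n_train).
    diff = X_query[:, None, :] - X_train[None, :, :]
    sq_dist = np.sum((diff / bandwidth) ** 2, axis=2)
    # Subtracting the row minimum leaves the weight ratios unchanged
    # and avoids every weight underflowing to zero.
    sq_dist = sq_dist - sq_dist.min(axis=1, keepdims=True)
    weights = np.exp(-0.5 * sq_dist)
    return weights @ y_train / weights.sum(axis=1)


def cross_validated_q2(X, y, bandwidth, n_folds=5, seed=0):
    """q2 = 1 - PRESS / SS, with PRESS accumulated over held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    press = 0.0
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = nadaraya_watson(
            X[train_mask], y[train_mask], X[test_idx], bandwidth
        )
        press += np.sum((y[test_idx] - pred) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic stand-in data (assumption): two descriptors on a common
# scale and a smooth nonlinear "activity" with additive noise.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(300, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=300)

q2 = cross_validated_q2(X, y, bandwidth=0.2)
print(f"5-fold cross-validated q2: {q2:.2f}")
```

In practice the descriptors would normally be standardised before computing distances, and the bandwidth would be tuned, for example by repeating the cross-validation over a grid of candidate values. With only two descriptors the fitted surface can be plotted directly, which is the visualisation advantage the abstract alludes to.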

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Enzyme Inhibitors / chemistry*
  • Enzyme Precursors / antagonists & inhibitors*
  • Intracellular Signaling Peptides and Proteins
  • Models, Chemical
  • Protein-Tyrosine Kinases / antagonists & inhibitors*
  • Quantitative Structure-Activity Relationship*
  • Regression Analysis
  • Syk Kinase

Substances

  • Enzyme Inhibitors
  • Enzyme Precursors
  • Intracellular Signaling Peptides and Proteins
  • Protein-Tyrosine Kinases
  • Syk Kinase