Computational chemistry approach to protein kinase recognition using 3D stochastic van der Waals spectral moments

J Comput Chem. 2007 Apr 30;28(6):1042-8. doi: 10.1002/jcc.20649.

Abstract

Three-dimensional (3D) protein structures now frequently lack functional annotations because of the increase in the rate at which chemical structures are solved with respect to experimental knowledge of biological activity. As a result, predicting structure-function relationships for proteins is an active research field in computational chemistry and has implications in medicinal chemistry, biochemistry and proteomics. In previous studies stochastic spectral moments were used to predict protein stability or function (González-Díaz, H. et al. Bioorg Med Chem 2005, 13, 323; Biopolymers 2005, 77, 296). Nevertheless, these moments take into consideration only electrostatic interactions and ignore other important factors such as van der Waals interactions. The present study introduces a new class of 3D structure molecular descriptors for folded proteins named the stochastic van der Waals spectral moments ((o)beta(k)). Among many possible applications, recognition of kinases was selected due to the fact that previous computational chemistry studies in this area have not been reported, despite the widespread distribution of kinases. The best linear model found was Kact = -9.44 degrees beta(0)(c) +10.94 degrees beta(5)(c) -2.40 degrees beta(0)(i) + 2.45 degrees beta(5)(m) + 0.73, where core (c), inner (i) and middle (m) refer to specific spatial protein regions. The model with a high Matthew's regression coefficient (0.79) correctly classified 206 out of 230 proteins (89.6%) including both training and predicting series. An area under the ROC curve of 0.94 differentiates our model from a random classifier. A subsequent principal components analysis of 152 heterogeneous proteins demonstrated that beta(k) codifies information different to other descriptors used in protein computational chemistry studies. Finally, the model recognizes 110 out of 125 kinases (88.0%) in a virtual screening experiment and this can be considered as an additional validation study (these proteins were not used in training or predicting series).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Entropy
  • Markov Chains
  • Principal Component Analysis
  • Protein Conformation
  • Protein Folding
  • Protein Kinases / chemistry*
  • Protein Kinases / metabolism
  • Proteins / chemistry
  • Quantitative Structure-Activity Relationship*
  • ROC Curve
  • Static Electricity

Substances

  • Proteins
  • Protein Kinases