Combinatorial clustering of residue position subsets predicts inhibitor affinity across the human kinome

PLoS Comput Biol. 2013;9(6):e1003087. doi: 10.1371/journal.pcbi.1003087. Epub 2013 Jun 6.

Abstract

The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while having overall poor predictive ability for only 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.
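
The general idea named in the abstract (cluster structures on every small subset of residue positions, then look for clusters that agree with the annotation labels) can be illustrated with a toy sketch. This is not the published implementation: the scalar per-position features, the single-linkage distance threshold, the "one known label per cluster" purity rule, and all function names below are assumptions made for illustration only. Unlabeled structures (label `None`) take part in the clustering but not in the purity check, which is what makes the sketch semi-supervised.

```python
from itertools import combinations


def cluster_by_threshold(vectors, threshold):
    """Single-linkage clustering (assumed stand-in for the real clustering step):
    two items share a cluster if a chain of pairs within `threshold` links them."""
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        dist = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j])) ** 0.5
        if dist <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def pure_clusters(structures, labels, k, threshold, min_size=2):
    """For every k-subset of positions, cluster the structures on those positions
    and keep clusters whose labeled members all carry the same label."""
    n_positions = len(structures[0])
    found = []
    for subset in combinations(range(n_positions), k):
        vectors = [tuple(s[p] for p in subset) for s in structures]
        for members in cluster_by_threshold(vectors, threshold):
            known = {labels[i] for i in members if labels[i] is not None}
            if len(members) >= min_size and len(known) == 1:
                found.append((subset, members, known.pop()))
    return found


# Hypothetical toy data: 4 structures, 3 residue positions, one scalar feature each.
structures = [
    [0.0, 5.0, 1.0],
    [0.1, 9.0, 1.1],
    [3.0, 5.1, 1.0],
    [3.1, 9.1, 1.2],
]
labels = ["binds", "binds", "no", None]  # None marks an unlabeled structure

results = pure_clusters(structures, labels, k=1, threshold=0.5)
for subset, members, label in results:
    print(subset, members, label)
```

In this toy data, position 0 separates the labeled binders from the non-binder, while position 2 places every structure in one mixed cluster and therefore contributes nothing. The unlabeled structure lands in label-consistent clusters with conflicting labels at positions 0 and 1, which hints at why evidence from many subsets has to be combined by a downstream classifier (the record's MeSH terms list Support Vector Machine) rather than read off a single cluster.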

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cluster Analysis
  • Humans
  • Models, Theoretical
  • Protein Kinases / chemistry*
  • Proteome
  • Support Vector Machine

Substances

  • Proteome
  • Protein Kinases

Grants and funding

This work has been supported in part by NSF Graduate Research Fellowship grant DGE-0237081 to DHB, NSF ABI grant ABI-0960612, the John and Ann Doerr Fund for Computational Biomedicine at Rice University, and the Texas Higher Education Coordinating Board NHARP 01907. Equipment used to run the experiments presented in this paper is part of the Shared University Grid at Rice, which is funded in part by NSF under Grant EIA-0216467 and by a partnership between Rice University, Sun Microsystems, and Sigma Solutions, Inc. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.