IChemPIC: A Random Forest Classifier of Biological and Crystallographic Protein-Protein Interfaces

J Chem Inf Model. 2015 Sep 28;55(9):2005-14. doi: 10.1021/acs.jcim.5b00190. Epub 2015 Sep 14.

Abstract

Protein-protein interactions are becoming a major focus of academic and pharmaceutical research to identify low molecular weight compounds able to modulate oligomeric signaling complexes. As the number of protein complexes of known three-dimensional structure is constantly increasing, there is a need to discard biologically irrelevant interfaces and prioritize those of high value for potential druggability assessment. A Random Forest model has been trained on a set of 300 protein-protein interfaces using 45 molecular interaction descriptors as input. It is able to predict the nature of external test interfaces (crystallographic vs biological) with accuracy at least equal to that of the best state-of-the-art methods. However, our method presents unique advantages in the early prioritization of potentially ligandable protein-protein interfaces: (i) it is equally robust in predicting either crystallographic or biological contacts and (ii) it can be applied to a wide array of oligomeric complexes ranging from small-sized biological interfaces to large crystallographic contacts.
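Claim (i), that the classifier is equally robust for both interface types, is commonly checked by comparing per-class recall rather than overall accuracy alone. The following is a minimal sketch of that check, not the authors' evaluation protocol: the labels "bio" and "xtal", the helper names, and the example predictions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    # Count correct predictions per true label.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, insensitive to class imbalance."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical labels: "bio" = biological interface, "xtal" = crystal contact.
# These values are invented for illustration, not data from the paper.
y_true = ["bio", "bio", "bio", "bio", "xtal", "xtal", "xtal", "xtal"]
y_pred = ["bio", "bio", "bio", "xtal", "xtal", "xtal", "xtal", "bio"]

recalls = per_class_recall(y_true, y_pred)
print(recalls)
print(balanced_accuracy(y_true, y_pred))
```

A classifier that is robust on both classes shows similar recall for each label. A large gap between the two values would indicate a bias toward one interface type even when overall accuracy looks high.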

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Crystallography, X-Ray
  • Databases, Protein*
  • Models, Biological*
  • Protein Conformation
  • Protein Interaction Mapping / instrumentation*
  • Proteins / chemistry*
  • Receptors, Interleukin-7 / chemistry

Substances

  • Proteins
  • Receptors, Interleukin-7