Distinguishing crystallographic from biological interfaces in protein complexes: role of intermolecular contacts and energetics for classification

BMC Bioinformatics. 2018 Nov 30;19(Suppl 15):438. doi: 10.1186/s12859-018-2414-9.

Abstract

Background: The study of macromolecular assemblies is fundamental to understanding cellular function. X-ray crystallography is the most common technique for solving their 3D structures at atomic resolution. In a crystal, however, both biologically relevant interfaces and non-specific interfaces resulting from crystal packing are observed. Given the complexity of the biological assemblies currently being tackled, classifying those interfaces, i.e. distinguishing biological from crystal lattice interfaces, is not trivial and is prone to errors. In this context, analyzing the physico-chemical characteristics of biological and crystal interfaces can help researchers identify features that distinguish them and gain a better understanding of the systems.

Results: In this work, we provide new insights into the differences between biological and crystallographic complexes by focusing on "pair-properties" of interfaces that have not yet been fully investigated. We investigated properties such as intermolecular residue-residue contacts (already successfully applied to the prediction of binding affinities) and interaction energies (electrostatic, van der Waals and desolvation). Using the XtalMany and BioMany interface datasets, we show that interfacial residue contacts, classified as a function of their physico-chemical properties, can distinguish between biological and crystallographic interfaces. On average, the energetic terms show higher values for crystal interfaces, reflecting the lower stability of interfaces formed by crystal packing compared with biological interfaces. Using a variety of machine learning approaches, we trained a new interface classification predictor based on contact and interaction energy features. Our predictor reaches an accuracy of 0.92 in classifying biological vs crystal interfaces, compared to 0.88 for EPPIC (one of the main state-of-the-art classifiers, which reports the same performance as PISA).
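To make the notion of residue contacts "classified as a function of their physico-chemical properties" more concrete, the following is a minimal Python sketch of how such contact counts could be computed for a pair of chains. It is not the authors' implementation: the 5.5 Å distance cutoff, the three-way grouping of residues into charged, polar and apolar, the input layout, and the function name `count_contacts` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import dist

# ASSUMPTION: one possible grouping of amino acids by physico-chemical
# character; the abstract does not specify the grouping that was used.
RESIDUE_CLASS = {
    **dict.fromkeys(["ASP", "GLU", "LYS", "ARG"], "charged"),
    **dict.fromkeys(["SER", "THR", "ASN", "GLN", "HIS", "TYR", "CYS"], "polar"),
    **dict.fromkeys(
        ["ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO", "GLY"], "apolar"
    ),
}


def count_contacts(chain_a, chain_b, cutoff=5.5):
    """Count intermolecular residue-residue contacts by class pair.

    Each chain is a list of (residue_name, [(x, y, z), ...]) tuples holding
    the atom coordinates of one residue. Two residues are in contact when
    any pair of their atoms lies within `cutoff` angstroms (ASSUMED value).
    """
    counts = Counter()
    for (name_a, atoms_a), (name_b, atoms_b) in product(chain_a, chain_b):
        if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            # Sort the two class labels so "polar-charged" and
            # "charged-polar" are counted as the same contact type.
            pair = "-".join(sorted((RESIDUE_CLASS[name_a], RESIDUE_CLASS[name_b])))
            counts[pair] += 1
    return counts


# Toy interface with one atom per residue (made-up coordinates).
chain_a = [("LYS", [(0.0, 0.0, 0.0)]), ("ALA", [(10.0, 0.0, 0.0)])]
chain_b = [
    ("GLU", [(3.0, 0.0, 0.0)]),
    ("LEU", [(12.0, 0.0, 0.0)]),
    ("SER", [(50.0, 0.0, 0.0)]),
]

example = count_contacts(chain_a, chain_b)
print(example)
```

In the toy example, only the LYS/GLU and ALA/LEU pairs fall within the cutoff, so the result contains one charged-charged and one apolar-apolar contact. Counts of this kind, computed per interface, are the sort of feature vector that could be passed to a classifier.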

Conclusion: In this work, we have gained insights into the nature of the intermolecular contacts and energetic terms that distinguish biological from crystallographic interfaces. Our findings might have broader applicability in structural biology, for example for the identification of near-native poses in docking. We implemented our classification approach in easy-to-use and fast software, freely available to the scientific community from http://github.com/haddocking/interface-classifier .

Keywords: Biological interface; Classification; Crystal interfaces; EPPIC; Intermolecular contacts; PISA; Predictor; Protein-protein interface; Residue contacts.

MeSH terms

  • Algorithms
  • Crystallography, X-Ray
  • Databases, Protein
  • Energy Metabolism*
  • Machine Learning
  • Proteins / chemistry*
  • Reproducibility of Results
  • Static Electricity

Substances

  • Proteins