Statistical analysis of interface similarity in crystals of homologous proteins

J Mol Biol. 2008 Aug 29;381(2):487-507. doi: 10.1016/j.jmb.2008.06.002. Epub 2008 Jun 7.

Abstract

Many proteins function as homo-oligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homo-oligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using cross-linking and mass spectrometry, solution-state NMR, and other experiments. However, for most proteins, the actual interfaces that are involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. Examination of interfaces across different Protein Data Bank (PDB) entries in a protein family reveals several important features. First, similarities in space group, asymmetric unit size, and cell dimensions and angles (within 1%) do not guarantee that two crystals are actually the same crystal form, containing similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries consisting of 126 dimers as well as larger structures and 132 monomers were used to determine whether the existence or lack of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer or not. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, whereas higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available. 
Finally, the Protein Interfaces, Surfaces, and Assemblies (PISA) database available from the European Bioinformatics Institute is more consistent in identifying interfaces observed in many crystal forms compared with the PDB and the European Bioinformatics Institute's Protein Quaternary Server (PQS). The PDB, in particular, is missing highly likely biological interfaces in its biological unit files for about 10% of PDB entries.
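The two observations above can be illustrated with a minimal Python sketch. It is not the authors' method: the `Crystal` record, the interface labels, and every function name are hypothetical, and it assumes that interfaces have already been clustered so that similar interfaces in different entries share a label. It contrasts the naive "same crystal form" test (same space group, same asymmetric unit size, cell parameters within 1%) with an interface-based comparison, and computes the fraction of crystal forms in which a given interface appears.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crystal:
    """Hypothetical summary of one crystal structure entry."""
    entry_id: str
    space_group: str
    chains_in_asu: int      # asymmetric unit size
    cell: tuple             # (a, b, c, alpha, beta, gamma)
    interfaces: frozenset   # assumed pre-clustered interface labels


def looks_like_same_form(x, y, tol=0.01):
    """Naive test: same space group, same asymmetric unit size, and
    every cell parameter within a relative tolerance (1% by default)."""
    if x.space_group != y.space_group or x.chains_in_asu != y.chains_in_asu:
        return False
    return all(
        abs(p - q) <= tol * max(abs(p), abs(q))
        for p, q in zip(x.cell, y.cell)
    )


def share_all_interfaces(x, y):
    """Interface-based test: both crystals contain the same set of interfaces."""
    return x.interfaces == y.interfaces


def fraction_of_forms_with(interface, forms):
    """Fraction of crystal forms in which the given interface label occurs."""
    if not forms:
        raise ValueError("at least one crystal form is required")
    hits = sum(1 for form in forms if interface in form.interfaces)
    return hits / len(forms)
```

With records like these, two entries can pass `looks_like_same_form` yet fail `share_all_interfaces`, and the reverse, which is the mismatch the abstract describes. The value returned by `fraction_of_forms_with` is only a raw frequency; turning it into a probability that an interface is biological would require calibration against a benchmark, which this sketch does not attempt.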

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Crystallization
  • Crystallography, X-Ray
  • Databases, Protein
  • Dimerization
  • Magnetic Resonance Spectroscopy
  • Models, Molecular
  • Models, Statistical*
  • Protein Binding
  • Protein Conformation
  • Protein Structure, Secondary
  • Proteins / chemistry*

Substances

  • Proteins