A general method for the unbiased improvement of solution NMR structures by the use of related X-ray data, the AUREMOL-ISIC algorithm

BMC Struct Biol. 2006 Jun 26;6:14. doi: 10.1186/1472-6807-6-14.

Abstract

Background: Rapid and accurate three-dimensional structure determination of biological macromolecules is mandatory to keep up with the vast progress made in the identification of primary sequence information. During the last few years, the amount of data deposited in the Protein Data Bank has increased substantially, providing additional information for novel structure determination projects. The key question is how to combine the available database information with the experimental data of the current project while ensuring that only relevant information is used and a correct structural bias is produced. For this purpose, a novel, fully automated algorithm based on Bayesian reasoning has been developed. It allows structural information from different sources to be combined in a consistent way, so that high-quality structures can be obtained from a limited set of experimental data. The new ISIC (Intelligent Structural Information Combination) algorithm is part of the larger AUREMOL software package.
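
The idea of combining information from two sources by Bayesian reasoning can be illustrated with a minimal sketch. This is not the ISIC algorithm itself, whose details are not given in this abstract; it is the textbook combination of two independent Gaussian estimates of the same quantity (for example, one interatomic distance seen in an NMR ensemble and in an X-ray structure). The names `Estimate` and `combine` and all numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A Gaussian estimate of one quantity (hypothetical helper type)."""
    mean: float   # e.g. a distance in angstroms
    sigma: float  # standard deviation, must be > 0


def combine(a: Estimate, b: Estimate) -> Estimate:
    """Combine two independent Gaussian estimates of the same quantity.

    Assumes a flat prior, so the posterior is the normalized product of
    the two Gaussians: a precision-weighted mean with a reduced variance.
    """
    wa = 1.0 / a.sigma ** 2  # precision of the first source
    wb = 1.0 / b.sigma ** 2  # precision of the second source
    mean = (wa * a.mean + wb * b.mean) / (wa + wb)
    sigma = (wa + wb) ** -0.5
    return Estimate(mean, sigma)


# Illustrative values only: a loosely defined NMR distance and a
# tightly defined X-ray distance for the same atom pair.
nmr = Estimate(mean=5.0, sigma=1.0)
xray = Estimate(mean=4.0, sigma=0.1)
posterior = combine(nmr, xray)
```

In this simple picture the more precise source dominates the result, which is exactly why a real method must check that the added structure is relevant before letting it contribute.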

Results: Our new approach was successfully tested by improving the solution NMR structures of three proteins with their corresponding X-ray structures: the Ras-binding domain of Byr2 from Schizosaccharomyces pombe, the Ras-binding domain of human RalGDS calculated from a limited set of NMR data, and the immunoglobulin binding domain of protein G from Streptococcus. In all test cases, clearly improved structures were obtained. The largest danger in using data from other sources is a possible bias towards the added structure. In the worst case, the structure from the additional source is essentially reproduced instead of a refined target structure. We could clearly show that the ISIC algorithm treats these difficulties properly.

Conclusion: In summary, we present a novel, fully automated method to combine strongly coupled knowledge from different sources. The combination with validation tools such as the calculation of NMR R-factors strengthens the impact of the method considerably, since the improvement of the structures can be assessed quantitatively. The ISIC method can be applied to a large number of similar problems in which the quality of the obtained three-dimensional structures is limited by the available experimental data, such as the improvement of large NMR structures calculated from sparse experimental data or the refinement of low-resolution X-ray structures. Structures may also be refined using other available structural information such as homology models.
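
To make the notion of a quantitative quality measure concrete, the following minimal sketch computes a generic R-factor as the normalized absolute difference between observed and back-calculated signal intensities. This is an assumed, simplified form borrowed from the crystallographic definition; the NMR R-factors used with AUREMOL are not defined in this abstract and may differ. The function name `r_factor` and the sample values are hypothetical.

```python
from typing import Sequence


def r_factor(observed: Sequence[float], calculated: Sequence[float]) -> float:
    """Generic R-factor: sum|obs - calc| / sum|obs|.

    Simplified illustration only; lower values mean better agreement
    between the experimental data and the structural model.
    """
    if len(observed) != len(calculated):
        raise ValueError("observed and calculated must have equal length")
    denom = sum(abs(o) for o in observed)
    if denom == 0:
        raise ValueError("observed intensities sum to zero")
    return sum(abs(o - c) for o, c in zip(observed, calculated)) / denom


# Illustrative values only: the same observed peak intensities compared
# with back-calculated intensities before and after a refinement.
obs = [2.0, 2.0, 4.0]
before = r_factor(obs, [1.0, 3.0, 2.0])
after = r_factor(obs, [2.0, 2.5, 3.5])
```

Comparing such a value before and after refinement, against data that were not taken from the added structure, is one way an improvement could be assessed numerically.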

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / metabolism
  • Binding Sites
  • Crystallography, X-Ray* / methods
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • MAP Kinase Kinase Kinases / chemistry
  • MAP Kinase Kinase Kinases / metabolism
  • Nerve Tissue Proteins / chemistry
  • Nerve Tissue Proteins / metabolism
  • Nuclear Magnetic Resonance, Biomolecular* / methods
  • Protein Structure, Tertiary
  • Schizosaccharomyces / enzymology
  • Schizosaccharomyces pombe Proteins / chemistry
  • Schizosaccharomyces pombe Proteins / metabolism
  • Software*
  • Streptococcus
  • ral Guanine Nucleotide Exchange Factor / chemistry
  • ral Guanine Nucleotide Exchange Factor / metabolism
  • ras Proteins / chemistry
  • ras Proteins / metabolism

Substances

  • Bacterial Proteins
  • G-substrate
  • IgG Fc-binding protein, Streptococcus
  • Nerve Tissue Proteins
  • Schizosaccharomyces pombe Proteins
  • ral Guanine Nucleotide Exchange Factor
  • BYR2 protein, S pombe
  • MAP Kinase Kinase Kinases
  • ras Proteins