Characterization of protein fold by wide-angle X-ray solution scattering

J Mol Biol. 2008 Nov 14;383(3):731-44. doi: 10.1016/j.jmb.2008.08.038. Epub 2008 Aug 23.

Abstract

Wide-angle X-ray solution scattering (WAXS) patterns contain substantial information about the three-dimensional structure of a protein. Although WAXS data have far less information than is required for determination of a full three-dimensional structure, the actual amount of information contained in a WAXS pattern has not been carefully quantified. Here we carry out an analysis of the amount of information that can be extracted from a WAXS pattern and demonstrate that it is adequate to estimate the secondary-structure content of a protein and to strongly limit its possible tertiary structures. WAXS patterns computed from the atomic coordinates of a set of 498 protein domains representing all of known fold space were used as the basis for constructing a multidimensional space of all corresponding WAXS patterns ('WAXS space'). Within WAXS space, each scattering pattern is represented by a single vector. A principal components analysis was carried out to identify those directions in WAXS space that provide the greatest discrimination among patterns. The number of dimensions that provide significant discrimination among protein folds agrees well with the number of independent parameters estimated from a naïve Shannon sampling theorem approach. Estimates of the relative abundances of secondary structures were made using training/test sets derived from this data set. The average error in the estimate of alpha-helical content was 11%, and of beta-sheet content was 9%. The distributions of proteins that are members of the four structure classes, alpha, beta, alpha/beta and alpha+beta, are well separated in WAXS space when data extending to a spacing of 2.2 Å are used. Quantification of the information embedded within a WAXS pattern indicates that these data can be used as a powerful constraint in homology modeling of protein structures.
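The Shannon-sampling estimate mentioned in the abstract can be illustrated with a short sketch. The abstract does not give the formula used, so the code below assumes one common form of the Shannon-channel count for solution scattering, N = D_max (q_max - q_min) / π, with q = 2π/d. The function names, the 40 Å maximum particle dimension, and the choice q_min = 0 are hypothetical values chosen for illustration; only the 2.2 Å spacing comes from the abstract.

```python
import math


def q_from_spacing(d_angstrom: float) -> float:
    """Convert a real-space spacing d (in Å) to momentum transfer q = 2*pi/d (in 1/Å)."""
    if d_angstrom <= 0:
        raise ValueError("spacing must be positive")
    return 2.0 * math.pi / d_angstrom


def shannon_channels(d_max: float, q_max: float, q_min: float = 0.0) -> float:
    """Naive Shannon-channel count, N = D_max * (q_max - q_min) / pi.

    d_max is the maximum particle dimension in Å (an assumed input, not a
    value taken from the abstract); q_min and q_max bound the measured
    range in 1/Å.
    """
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    if q_max <= q_min:
        raise ValueError("q_max must exceed q_min")
    return d_max * (q_max - q_min) / math.pi


# Hypothetical example: a 40 Å domain with data extending to 2.2 Å spacing.
q_max = q_from_spacing(2.2)
n_channels = shannon_channels(d_max=40.0, q_max=q_max)
print(f"{n_channels:.1f}")  # prints 36.4
```

With q_min = 0 the expression reduces to N = 2 D_max / d, so under this assumed form the estimate grows linearly with both the particle size and the inverse of the limiting spacing.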

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Crystallography, X-Ray / methods*
  • Databases, Factual
  • Mathematics
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Folding*
  • Protein Structure, Secondary*
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Software
  • Solutions / chemistry

Substances

  • Proteins
  • Solutions