Delineation of protein structure classes from multivariate analysis of protein Raman optical activity data

J Mol Biol. 2006 Oct 13;363(1):19-26. doi: 10.1016/j.jmb.2006.08.038. Epub 2006 Aug 22.

Abstract

Vibrational Raman optical activity (ROA), measured as a small difference in the intensity of Raman scattering from chiral molecules in right- and left-circularly polarized incident light, or as the intensity of a small circularly polarized component in the scattered light, is a powerful probe of the aqueous solution structure of proteins. On account of the large number of structure-sensitive bands in protein ROA spectra, multivariate analysis techniques such as non-linear mapping (NLM) are especially favourable for determining structural relationships between different proteins. Here NLM is used to map a dataset of 80 polypeptide, protein and virus ROA spectra, considered as points in a multidimensional space with axes representing the digitized wavenumbers, into readily visualizable two- and three-dimensional spaces in which points close to or distant from each other, respectively, represent similar or dissimilar structures. Discrete clusters are observed which correspond to the seven structure classes: all alpha, mainly alpha, alpha-beta, mainly beta, all beta, mainly disordered/irregular and all disordered/irregular. The average standardised ROA spectra of the proteins falling within each structure class have distinct features characteristic of each class. A distinct cluster containing the wheat protein A-gliadin and the plant viruses potato virus X, narcissus mosaic virus, papaya mosaic virus and tobacco rattle virus, all of which appear in the mainly alpha cluster in the two-dimensional representation, becomes clearly separated in the direction of increasing disorder in the three-dimensional representation. This suggests that the corresponding five proteins, none of which to date has yielded high-resolution X-ray structures, consist mainly of alpha-helix and disordered structure with little or no beta-sheet. This combination of structural elements may have functional significance, such as facilitating disorder-to-order transitions (and vice versa) and suppressing aggregation, in these proteins and also in sequences within other proteins. The use of ROA to identify proteins containing significant amounts of disordered structure will, inter alia, be valuable in structural genomics/proteomics since disordered regions often inhibit crystallization.
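
The mapping step described in the abstract, in which each spectrum is treated as a point in a high-dimensional space and placed in two or three dimensions so that inter-point distances are preserved as far as possible, can be illustrated with a minimal sketch. The abstract does not specify the NLM algorithm, so the sketch assumes a Sammon-style stress function minimised by simple gradient descent with step halving. The toy "spectra" are synthetic, and all function names (`toy_spectra`, `nonlinear_map`, and so on) are hypothetical, not taken from the paper.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance between every pair of points (full symmetric matrix)."""
    return [[math.dist(p, q) for q in points] for p in points]


def sammon_stress(d_high, d_low):
    """Assumed stress: distance mismatch, weighted by 1 / original distance."""
    n = len(d_high)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n) if d_high[i][j] > 0]
    scale = sum(d_high[i][j] for i, j in pairs)
    return sum((d_high[i][j] - d_low[i][j]) ** 2 / d_high[i][j] for i, j in pairs) / scale


def stress_gradient(coords, d_high, d_low):
    """Gradient of the stress with respect to every low-dimensional coordinate."""
    n, dim = len(coords), len(coords[0])
    scale = sum(d_high[i][j] for i in range(n) for j in range(i + 1, n))
    grad = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or d_high[i][j] == 0 or d_low[i][j] == 0:
                continue
            w = 2.0 * (d_low[i][j] - d_high[i][j]) / (scale * d_high[i][j] * d_low[i][j])
            for k in range(dim):
                grad[i][k] += w * (coords[i][k] - coords[j][k])
    return grad


def nonlinear_map(spectra, dim=2, steps=200, rate=1.0, seed=0):
    """Map spectra (lists of intensities) to `dim` dimensions; return coords and stress history."""
    rng = random.Random(seed)
    d_high = pairwise_distances(spectra)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in spectra]
    stress = sammon_stress(d_high, pairwise_distances(coords))
    history = [stress]
    for _ in range(steps):
        grad = stress_gradient(coords, d_high, pairwise_distances(coords))
        trial = [[c - rate * g for c, g in zip(row, g_row)]
                 for row, g_row in zip(coords, grad)]
        trial_stress = sammon_stress(d_high, pairwise_distances(trial))
        if trial_stress < stress:
            # Accept the step and cautiously enlarge the step size.
            coords, stress = trial, trial_stress
            rate *= 1.2
        else:
            # Reject the step and halve the step size.
            rate *= 0.5
        history.append(stress)
    return coords, history


def toy_spectra(seed=1):
    """Synthetic stand-in data: three groups of four noisy 6-point 'spectra'."""
    rng = random.Random(seed)
    centres = [[1, 0, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0], [0, 0, 1, 0, 0, 1]]
    return [[c + rng.gauss(0, 0.05) for c in centre]
            for centre in centres for _ in range(4)]


if __name__ == "__main__":
    coords, history = nonlinear_map(toy_spectra(), dim=2)
    print("initial stress:", history[0], "final stress:", history[-1])
```

Calling `nonlinear_map(spectra, dim=3)` gives the three-dimensional counterpart. In the study each input vector would be a digitized ROA spectrum rather than a six-element toy vector, and the preprocessing and optimiser actually used may differ from what is assumed here.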

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Humans
  • Multivariate Analysis
  • Protein Folding
  • Proteins / chemistry*
  • Proteins / classification*
  • Spectrum Analysis, Raman*

Substances

  • Proteins