Identifying sequence regions undergoing conformational change via predicted continuum secondary structure

Bioinformatics. 2006 Aug 1;22(15):1809-14. doi: 10.1093/bioinformatics/btl198. Epub 2006 May 23.

Abstract

Motivation: Conformational flexibility is essential to many protein functions, such as catalytic activity. To assist efforts to determine and explore the functional properties of a protein, it is desirable to automatically identify regions that are prone to conformational change. It was recently shown that a probabilistic predictor of continuum secondary structure is more accurate than categorical predictors for structurally ambivalent sequence regions, suggesting that such models are suited to characterizing protein flexibility.

Results: We develop a computational method for identifying regions that are prone to conformational change directly from the amino acid sequence. The method uses the entropy of the probabilistic output of an 8-class continuum secondary structure predictor. Results for 171 unique amino acid sequences with well-characterized variable structure (identified in the 'Macromolecular movements database') indicate that the method is highly sensitive at identifying flexible protein regions, but false positives remain a problem. The method can be used to explore conformational flexibility of proteins (including hypothetical or synthetic ones) whose structure is yet to be determined experimentally.
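The entropy criterion described above can be illustrated with a minimal sketch. It assumes the predictor emits, for each residue, a probability distribution over the 8 secondary structure classes, and it uses Shannon entropy in bits with a hypothetical cut-off of 1.5 bits. The function names, the threshold and the example probabilities are illustrative assumptions and are not taken from the paper or from the published predictor.

```python
import math

def residue_entropy(probs):
    """Shannon entropy (in bits) of one residue's class probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def flag_flexible(prob_rows, threshold=1.5):
    """Score each residue and flag those at or above the entropy threshold.

    prob_rows: one list of 8 class probabilities per residue (assumed format).
    threshold: hypothetical cut-off in bits, not a value from the paper.
    """
    entropies = [residue_entropy(row) for row in prob_rows]
    flags = [h >= threshold for h in entropies]
    return entropies, flags

# Made-up predictor output for two residues:
confident = [0.93] + [0.01] * 7   # one class dominates, low entropy
ambivalent = [0.125] * 8          # uniform over 8 classes, maximum entropy
entropies, flags = flag_flexible([confident, ambivalent])
print(entropies, flags)
```

With 8 classes the entropy ranges from 0 bits (a fully confident prediction) to 3 bits (a uniform distribution), so residues whose predicted structure is ambivalent score high. A lower threshold would flag more residues, which is one way the trade-off between sensitivity and false positives mentioned above could arise.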

Availability: The predictor, sequence data and supplementary studies are available at http://pprowler.itee.uq.edu.au/sspred/ and are free for academic use.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computer Simulation
  • Models, Chemical*
  • Models, Molecular*
  • Models, Statistical
  • Molecular Sequence Data
  • Protein Conformation
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Structure-Activity Relationship

Substances

  • Proteins