Comprehensive description of protein structures using protein folding shape code

Proteins. 2008 May 15;71(3):1497-518. doi: 10.1002/prot.21932.

Abstract

Understanding and describing three-dimensional (3D) protein structures has dominated biological and biochemical research for many years. A comprehensive description of protein folding structure is essential for the advancement of protein research. In this study, a novel description method is developed to generate a set of folding patterns with specific shape features as well as vector characteristics in space. To accomplish this goal, the method combines features from geometry, morphology, and topology in an algorithmic approach to achieve a full description of proteins. A set of 27 vectors is derived mathematically from an enclosed space, and each vector represents a 3D folding shape of five successive Cα atoms in the protein backbone. The 27 vectors are represented by 27 symbols, which together are called the protein folding shape code (PFSC). The PFSC method offers a digital description of folding shapes along a protein backbone, which facilitates protein structure analysis, and provides a tool to study the similarity and dissimilarity among proteins or protein conformers. The PFSC results show overall agreement with structural assignments from the Protein Data Bank, as well as with results from other methods. All results show that the PFSC method is a reliable tool with explicit meaning for protein folding shape description.
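The bookkeeping implied by the abstract (a window of five successive Cα atoms, and an alphabet of 27 symbols) can be illustrated with a minimal Python sketch. It is not the published PFSC algorithm. The abstract does not say how the 27 vectors are derived, so the sketch assumes, purely for illustration, that each window is summarized by three features with three levels each (3 × 3 × 3 = 27). The names `windows` and `symbol_for` and the alphabet `SYMBOLS` are hypothetical.

```python
from string import ascii_uppercase
from typing import Iterator, Sequence, Tuple

# Window length taken from the abstract: five successive C-alpha atoms.
WINDOW = 5

# Hypothetical 27-symbol alphabet (26 letters plus "$"); the abstract
# does not list the actual PFSC symbols.
SYMBOLS = ascii_uppercase + "$"


def windows(ca_atoms: Sequence, size: int = WINDOW) -> Iterator[Tuple]:
    """Yield every run of `size` successive items along the backbone."""
    for start in range(len(ca_atoms) - size + 1):
        yield tuple(ca_atoms[start:start + size])


def symbol_for(a: int, b: int, c: int) -> str:
    """Map three feature levels (each 0, 1, or 2) to one of 27 symbols.

    Assumption: the 27 codes factor as 3 x 3 x 3. This is an
    illustrative choice, not something stated in the abstract.
    """
    for level in (a, b, c):
        if level not in (0, 1, 2):
            raise ValueError("each level must be 0, 1, or 2")
    return SYMBOLS[a * 9 + b * 3 + c]


if __name__ == "__main__":
    backbone = list(range(8))            # stand-in for 8 C-alpha positions
    print(len(list(windows(backbone))))  # 4 overlapping windows
    print(symbol_for(0, 0, 0), symbol_for(2, 2, 2))  # A $
```

Under these assumptions, a backbone of N Cα atoms yields N − 4 overlapping windows, so a per-window code gives one symbol for each window position along the chain.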

Publication types

  • Comparative Study

MeSH terms

  • Amino Acid Sequence
  • Molecular Sequence Data
  • Peptide Fragments / chemistry
  • Protein Conformation
  • Protein Folding*
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Structural Homology, Protein

Substances

  • Peptide Fragments
  • Proteins