Connectivity independent protein-structure alignment: a hierarchical approach

BMC Bioinformatics. 2006 Nov 21;7:510. doi: 10.1186/1471-2105-7-510.

Abstract

Background: Protein-structure alignment is a fundamental tool for studying protein function and evolution and for model building. Several methods for structure alignment have been introduced in the last decade, but most of them ignore that structurally similar proteins can share the same spatial arrangement of secondary structure elements (SSEs) yet differ in the underlying polypeptide chain connectivity (non-sequential SSE connectivity).

Results: We perform protein-structure alignment using a two-level hierarchical approach implemented in the program GANGSTA. On the first level, pair contacts and relative orientations between SSEs (i.e. alpha-helices and beta-strands) are maximized with a genetic algorithm (GA). On the second level, residue pair contacts from the best SSE alignments are optimized. We have tested the method on visually optimized structure alignments of protein pairs (pairwise mode) and in database scans. For a given protein structure, our method is able to detect significant structural similarity of functionally important folds with non-sequential SSE connectivity. For structure alignments with strictly sequential SSE connectivity, its performance is comparable to that of other structure alignment methods.
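
The first-level idea, searching over SSE assignments without requiring them to follow chain order, can be illustrated with a minimal sketch. This is not the GANGSTA implementation: the scoring here counts only preserved SSE pair contacts (relative orientations are ignored), the search is a simplified mutation-only evolutionary loop without crossover, and all names (score, mutate, ga_align, contacts_a, contacts_b) and the toy contact matrices are assumptions made for illustration.

```python
import random

def score(mapping, contacts_a, contacts_b):
    """Count SSE pair contacts of A that are preserved in B under the mapping.

    mapping[i] is the index of the SSE in B assigned to SSE i of A.
    """
    n = len(mapping)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if contacts_a[i][j] and contacts_b[mapping[i]][mapping[j]]:
                total += 1
    return total

def mutate(mapping, n_b, rng):
    """Either assign an unused SSE of B to one position, or swap two positions."""
    child = list(mapping)
    unused = [k for k in range(n_b) if k not in child]
    if unused and rng.random() < 0.5:
        child[rng.randrange(len(child))] = rng.choice(unused)
    else:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child

def ga_align(contacts_a, contacts_b, pop_size=40, generations=200, seed=0):
    """Search for an injective, order-independent SSE mapping from A into B."""
    rng = random.Random(seed)
    n_a, n_b = len(contacts_a), len(contacts_b)
    assert 2 <= n_a <= n_b, "sketch assumes A has at least 2 and at most len(B) SSEs"
    fitness = lambda m: score(m, contacts_a, contacts_b)
    # Random injective mappings: no sequential-order constraint is imposed.
    population = [rng.sample(range(n_b), n_a) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # elitist selection
        children = [mutate(rng.choice(survivors), n_b, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)

# Toy data (hypothetical): A has contacts 0-1, 1-2, 2-3; B has the same
# spatial contact pattern, but its SSEs appear in a different chain order.
contacts_a = [[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]]
contacts_b = [[0, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [1, 1, 0, 0]]

best = ga_align(contacts_a, contacts_b)
print(best, score(best, contacts_a, contacts_b))
```

In this toy case the sequential mapping [0, 1, 2, 3] preserves no contacts, whereas the non-sequential mapping [2, 0, 3, 1] preserves all three, which is the kind of similarity a connectivity-dependent method would miss. A second level, as described above, would then refine residue pair contacts starting from the best SSE mappings.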

Conclusion: As demonstrated for several applications, GANGSTA finds meaningful protein-structure alignments independent of the SSE connectivity. GANGSTA is able to detect structural similarity of protein folds that are assigned to different superfamilies but nevertheless possess similar structures and perform related functions, even if these proteins differ in SSE connectivity.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Motifs
  • Computational Biology
  • Computer Graphics
  • Databases, Genetic
  • Models, Molecular
  • Protein Conformation
  • Protein Folding
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / genetics
  • Reproducibility of Results
  • Sequence Alignment*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteins