Hierarchical clustering based upon contextual alignment of proteins: a different way to approach phylogeny

C R Biol. 2005 Jan;328(1):11-22. doi: 10.1016/j.crvi.2004.11.001.

Abstract

We present a computational study that applies a new approach to the analysis of protein sequences. The contextual alignment model, recently proposed by Gambin et al. (2002), assumes that, when an alignment is constructed, the score of substituting one residue for another depends on the surrounding residues. The contextual alignment scores calculated in this model were used for hierarchical clustering of several protein families from the Clusters of Orthologous Groups (COG) database. Clusterings were also constructed using the standard approach. The comparative analysis shows that the contextual model yields more consistent clustering trees. The difference, although small, is without exception in favour of the contextual model. The consistency of the family of trees is measured by several consensus and agreement methods, as well as by an inter-tree distance approach.
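
The pipeline described above (context-dependent substitution scores, pairwise alignment, then hierarchical clustering) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the scoring scheme of Gambin et al. or the procedure used in the study: the match/mismatch values, the linear gap penalty, the "bonus when the preceding residues agree" rule, the average-linkage criterion, and the toy sequences are all hypothetical.

```python
from itertools import combinations

GAP = -2  # assumed linear gap penalty (hypothetical value)


def plain_score(a, b, prev_a, prev_b):
    """Context-free score: +2 for a match, -1 for a mismatch (toy values)."""
    return 2 if a == b else -1


def contextual_score(a, b, prev_a, prev_b):
    """Toy context-dependent score: +1 bonus when the preceding residues agree."""
    bonus = 1 if prev_a is not None and prev_a == prev_b else 0
    return plain_score(a, b, prev_a, prev_b) + bonus


def align_score(s, t, score):
    """Global alignment score by dynamic programming (Needleman-Wunsch style)."""
    n, m = len(s), len(t)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP
    for j in range(1, m + 1):
        dp[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Context = the residue that precedes each position in its own sequence.
            prev_a = s[i - 2] if i > 1 else None
            prev_b = t[j - 2] if j > 1 else None
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s[i - 1], t[j - 1], prev_a, prev_b),
                dp[i - 1][j] + GAP,
                dp[i][j - 1] + GAP,
            )
    return dp[n][m]


def cluster(seqs, score):
    """Average-linkage agglomerative clustering; returns a nested-tuple tree."""
    names = list(seqs)
    sim = {
        frozenset((a, b)): align_score(seqs[a], seqs[b], score)
        for a, b in combinations(names, 2)
    }

    def avg(x, y):
        # Mean pairwise alignment score between the members of two clusters.
        pairs = [(a, b) for a in x[1] for b in y[1]]
        return sum(sim[frozenset(p)] for p in pairs) / len(pairs)

    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in names]
    while len(clusters) > 1:
        # Merge the two clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical toy sequences, not taken from the COG database.
SEQS = {"A": "ACDEFG", "B": "ACDEFH", "C": "WYWYWY", "D": "WYWYAA"}

if __name__ == "__main__":
    print(cluster(SEQS, plain_score))
    print(cluster(SEQS, contextual_score))
```

Running the same clustering once with `plain_score` and once with `contextual_score` mirrors the comparison made in the study: two families of trees built from the same sequences, differing only in how substitutions are scored. Comparing those trees with consensus or inter-tree distance measures is outside the scope of this sketch.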

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • DNA / genetics
  • Decision Trees
  • Enzymes / genetics
  • Phylogeny*
  • Proteins / classification*
  • Proteins / genetics*
  • Sequence Alignment

Substances

  • Enzymes
  • Proteins
  • DNA