Measuring guide-tree dependency of inferred gaps in progressive aligners

Bioinformatics. 2013 Apr 15;29(8):1011-7. doi: 10.1093/bioinformatics/btt095. Epub 2013 Feb 23.

Abstract

Motivation: Multiple sequence alignments are generally reconstructed using a progressive approach that follows a guide-tree. During this process, gaps are introduced at a cost to maximize residue pairing, but it is unclear whether inferred gaps reflect actual past events of sequence insertion or deletion. Patterns of inferred gaps in alignments have been found to carry information about the true phylogeny, but it is not yet known whether gaps simply reflect information that was already present in the guide-tree.

Results: Here we develop a framework to disentangle the phylogenetic signal carried by gaps from that which is already present in the guide-tree. Our results indicate that most gaps are incorrectly inserted in patterns that nevertheless follow the guide-tree. Thus, most gap patterns in current alignments are not informative per se. This affects different programs to varying degrees, with PRANK being the most sensitive to the guide-tree.
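
As a rough illustration of what it means for gap patterns to "follow the guide-tree", the sketch below treats each gapped alignment column as a bipartition of the taxa (gapped versus ungapped) and reports the fraction of such bipartitions that also occur as splits in a guide-tree. This is not the framework developed in the paper, whose details are not given in the abstract. The function names, the toy alignment, and the representation of the guide-tree as a list of taxon sets are all assumptions made for illustration.

```python
def gap_splits(alignment):
    """Return, for each column, the set of taxa carrying a gap.

    Only columns where at least two taxa are gapped and at least two
    are not are kept, since only these define a non-trivial bipartition.
    `alignment` maps taxon name -> aligned sequence (equal lengths assumed).
    """
    taxa = sorted(alignment)
    length = len(alignment[taxa[0]])
    splits = []
    for col in range(length):
        gapped = frozenset(t for t in taxa if alignment[t][col] == "-")
        if 2 <= len(gapped) <= len(taxa) - 2:
            splits.append(gapped)
    return splits


def normalize(split, taxa):
    """Represent a bipartition by the side that excludes the first taxon."""
    reference = min(taxa)
    return frozenset(taxa - split) if reference in split else frozenset(split)


def fraction_following_tree(alignment, tree_splits):
    """Fraction of informative gap columns whose bipartition is a tree split."""
    taxa = frozenset(alignment)
    tree = {normalize(frozenset(s), taxa) for s in tree_splits}
    columns = gap_splits(alignment)
    if not columns:
        return 0.0
    hits = sum(1 for s in columns if normalize(s, taxa) in tree)
    return hits / len(columns)


# Hypothetical toy data: guide-tree ((A,B),C,(D,E)) given by its two
# non-trivial splits, and a five-sequence alignment with three gapped columns.
toy_alignment = {
    "A": "AC--GT",
    "B": "AC--GT",
    "C": "ACGTGT",
    "D": "ACGT-T",
    "E": "ACG--T",
}
toy_tree_splits = [{"A", "B"}, {"D", "E"}]

toy_fraction = fraction_following_tree(toy_alignment, toy_tree_splits)
```

In this toy case, the gap patterns in two of the three informative columns coincide with guide-tree splits. A high fraction on its own does not show that the gaps are correct. Separating the two cases would require comparing against a reference such as the true tree, in line with the distinction the abstract draws.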

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Phylogeny*
  • Sequence Alignment / methods*