AliGROOVE--visualization of heterogeneous sequence divergence within multiple sequence alignments and detection of inflated branch support

BMC Bioinformatics. 2014 Aug 30;15(1):294. doi: 10.1186/1471-2105-15-294.

Abstract

Background: Masking of multiple sequence alignment blocks has become a powerful method to enhance the tree-likeness of the underlying data. However, existing masking approaches are insensitive to heterogeneous sequence divergence, which can mislead tree reconstructions. We present AliGROOVE, a new method based on a sliding window and a Monte Carlo resampling approach that visualizes heterogeneous sequence divergence or alignment ambiguity related to single taxa or subsets of taxa within a multiple sequence alignment and tags suspicious branches on a given tree.

Results: We used simulated multiple sequence alignments to show that the extent of alignment ambiguity in pairwise sequence comparisons is correlated with the frequency of misplaced taxa in tree reconstructions. The approach implemented in AliGROOVE makes it possible to detect nodes within a tree that are supported despite the absence of phylogenetic signal in the underlying multiple sequence alignment. We show that AliGROOVE detects heterogeneous sequence divergence equally well in a case study based on an empirical data set of mitochondrial DNA sequences of chelicerates.

Conclusions: The AliGROOVE approach has the potential to identify single taxa or subsets of taxa which show predominantly randomized sequence similarity in comparison with other taxa in a multiple sequence alignment. It further allows the reliability of node support to be evaluated in a novel way.
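
Illustration: The sliding-window and Monte Carlo resampling idea named in the Background can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the AliGROOVE implementation: the function names (`identity`, `window_profile`, `pair_score`), the plain match/mismatch identity measure, the null model (windows drawn at random from the residues of each sequence), the window size, the 0.05 threshold, and the +1/-1 window coding are all hypothetical choices made for this example. Gap and ambiguity characters are not treated specially.

```python
import random


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def window_profile(seq_a, seq_b, window=6, n_resamples=200, alpha=0.05, seed=0):
    """Score each sliding window of a pairwise comparison.

    Returns one value per window: +1 if the observed identity is higher
    than expected from randomly resampled windows, otherwise -1.
    All parameters and the scoring scheme are assumptions of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if window < 1 or window > len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    profile = []
    for start in range(len(seq_a) - window + 1):
        observed = identity(seq_a[start:start + window],
                            seq_b[start:start + window])
        # Monte Carlo null: random windows drawn (with replacement) from
        # the residues of each sequence, preserving their composition.
        exceed = 0
        for _ in range(n_resamples):
            rand_a = rng.choices(seq_a, k=window)
            rand_b = rng.choices(seq_b, k=window)
            if identity(rand_a, rand_b) >= observed:
                exceed += 1
        p_value = exceed / n_resamples
        profile.append(1 if p_value < alpha else -1)
    return profile


def pair_score(profile):
    """Average the window values into a single score for the sequence pair."""
    return sum(profile) / len(profile)
```

In this sketch, a pair whose windows mostly resemble random similarity receives a score near -1, and a pair with consistently non-random similarity receives a score near +1. Computing `pair_score(window_profile(a, b))` for every pair of sequences in an alignment would give a matrix that could be inspected for taxa that compare poorly with most others.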

MeSH terms

  • Algorithms*
  • Animals
  • Arthropods / classification
  • Arthropods / genetics
  • Computational Biology / methods*
  • Computer Graphics*
  • DNA, Mitochondrial / genetics
  • Genetic Variation*
  • Monte Carlo Method
  • Phylogeny*
  • Reproducibility of Results
  • Sequence Alignment / methods*

Substances

  • DNA, Mitochondrial