Invariant Versus Classical Quartet Inference When Evolution is Heterogeneous Across Sites and Lineages

Syst Biol. 2016 Mar;65(2):280-91. doi: 10.1093/sysbio/syv086. Epub 2015 Nov 11.

Abstract

One reason classical phylogenetic reconstruction methods fail to infer the underlying topology correctly is that they assume oversimplified models. In this article, we propose a quartet reconstruction method consistent with the most general Markov model of nucleotide substitution, which can also deal with data coming from mixtures on the same topology. Our proposed method uses phylogenetic invariants and provides a system of weights that can be used as input for quartet-based methods. We study its performance on real data and on a wide range of simulated 4-taxon data (both time-homogeneous and nonhomogeneous, with or without among-site rate heterogeneity, and with different branch length settings). We compare it to the classical methods of neighbor-joining (with paralinear distance), maximum likelihood (with different underlying models), and maximum parsimony. Our results show that this method is accurate and robust, performs similarly to maximum likelihood when the data satisfy the assumptions of both methods, and outperforms the other methods when those are based on inappropriate substitution models. If alignments are long enough, it also outperforms other methods when some of its own assumptions are violated.
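
As an illustration of the distance-based baseline mentioned above (neighbor-joining with paralinear distance), the following is a minimal sketch and not the authors' implementation. It assumes the usual form of the paralinear distance, d = -ln(det J / sqrt(det D_x det D_y)), where J is the 4x4 joint nucleotide frequency matrix of two aligned sequences and D_x, D_y are the diagonal matrices of their marginal frequencies. It also assumes that on four taxa the distance-based choice can be made with the four-point rule (pick the split with the smallest sum of within-pair distances). The function names `paralinear` and `four_point_split` are hypothetical, and the invariant-based weights proposed in the article are not reproduced here.

```python
import math

NUCS = "ACGT"


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def paralinear(seq_x, seq_y):
    """Paralinear distance between two aligned sequences.

    Sites where either sequence has a non-ACGT symbol are skipped.
    Returns math.inf when the distance is undefined (det J <= 0 or a
    nucleotide is missing from one of the sequences).
    """
    counts = [[0.0] * 4 for _ in range(4)]
    n = 0
    for a, b in zip(seq_x, seq_y):
        if a in NUCS and b in NUCS:
            counts[NUCS.index(a)][NUCS.index(b)] += 1
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    joint = [[c / n for c in row] for row in counts]
    fx = [sum(row) for row in joint]                              # marginals of seq_x
    fy = [sum(joint[i][j] for i in range(4)) for j in range(4)]   # marginals of seq_y
    d = det(joint)
    denom = math.sqrt(math.prod(fx) * math.prod(fy))
    if d <= 0 or denom == 0:
        return math.inf
    return -math.log(d / denom)


def four_point_split(seqs):
    """Pick the quartet split with the smallest sum of within-pair distances.

    `seqs` maps exactly four taxon names to aligned sequences.
    Returns (best_split, scores), with splits written as "a,b|c,d".
    """
    a, b, c, d = list(seqs)

    def dist(x, y):
        return paralinear(seqs[x], seqs[y])

    scores = {
        f"{a},{b}|{c},{d}": dist(a, b) + dist(c, d),
        f"{a},{c}|{b},{d}": dist(a, c) + dist(b, d),
        f"{a},{d}|{b},{c}": dist(a, d) + dist(b, c),
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy alignment (made up for illustration): a and b are near-identical,
    # as are c and d.
    sa = "ACGT" * 30
    sb = "ACGT" * 29 + "CAGT"
    sc = "CATG" * 10 + "ACGT" * 20
    sd = "CATG" * 10 + "ACGT" * 19 + "ACTG"
    best, scores = four_point_split({"a": sa, "b": sb, "c": sc, "d": sd})
    print(best)
```

Because the determinant ratio does not change when all counts are rescaled, the same value results whether J holds counts or frequencies. The sketch returns infinity rather than guessing when the distance is undefined, which can happen on short or highly divergent alignments.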

Keywords: General Markov model; heterogeneity across lineages; heterogeneity across sites; phylogenetic invariants; topology reconstruction; yeast.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Evolution
  • Candida albicans / classification
  • Candida albicans / genetics
  • Classification / methods*
  • Computer Simulation
  • Models, Biological*
  • Phylogeny*
  • Saccharomyces / classification
  • Saccharomyces / genetics