Long branch effects distort maximum likelihood phylogenies in simulations despite selection of the correct model

PLoS One. 2012;7(5):e36593. doi: 10.1371/journal.pone.0036593. Epub 2012 May 9.

Abstract

The aim of our study was to test the robustness and efficiency of maximum likelihood with respect to different long branch effects on multiple-taxon trees. We simulated data of different alignment lengths under two 11-taxon trees and a broad range of branch length conditions. The data were analyzed with the true model parameters as well as with estimated and incorrect assumptions about among-site rate variation. When length differences between connected branches increase strongly, tree inference can fail even under the correct likelihood model assumptions. We found that incorporating invariant sites together with Γ-distributed site rates in the tree reconstruction (Γ+I) increases the robustness of maximum likelihood in comparison with models using only Γ. For some topologies and branch lengths, the reconstruction success of maximum likelihood under the correct model remains low even for alignments of 100,000 base positions. Altogether, the high confidence that is put in maximum likelihood trees is not justified under certain tree shapes, even when alignment lengths reach 100,000 base positions.
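
As an illustration of the among-site rate variation discussed above, the following is a minimal sketch of how per-site relative rates could be drawn under a Γ+I model. It is not the simulation code used in the study. The function name `sample_site_rates`, the shape parameter `alpha`, the invariant-site proportion `p_inv`, and the example values are all hypothetical. The rescaling by 1/(1 - p_inv), which keeps the mean rate across all sites at 1, is a common convention assumed here, not something stated in the abstract.

```python
import random


def sample_site_rates(n_sites, alpha, p_inv, seed=None):
    """Draw per-site relative rates under a Gamma+I model (illustrative only).

    alpha : shape of the Gamma distribution (hypothetical value chosen by the caller)
    p_inv : proportion of invariant sites, must satisfy 0 <= p_inv < 1
    """
    if not 0.0 <= p_inv < 1.0:
        raise ValueError("p_inv must be in [0, 1)")
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sites):
        if rng.random() < p_inv:
            # Invariant site: it never changes, so its rate is zero.
            rates.append(0.0)
        else:
            # Gamma with shape alpha and scale 1/alpha has mean 1.
            # Assumed convention: divide by (1 - p_inv) so that the
            # mean rate over all sites, invariant ones included, is 1.
            rate = rng.gammavariate(alpha, 1.0 / alpha)
            rates.append(rate / (1.0 - p_inv))
    return rates


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for demonstration.
    rates = sample_site_rates(100_000, alpha=0.5, p_inv=0.3, seed=1)
    print("fraction invariant:", sum(r == 0.0 for r in rates) / len(rates))
    print("mean rate:", sum(rates) / len(rates))
```

Setting `p_inv=0` in this sketch reduces it to a Γ-only model, the simpler alternative that the abstract compares against Γ+I.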

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation*
  • Likelihood Functions
  • Models, Genetic*
  • Phylogeny*