Statistical measures of uncertainty for branches in phylogenetic trees inferred from molecular sequences by using model-based methods

Borys Wróbel

doi:10.1007/BF03195249

J Appl Genet. 2008;49(1):49-67. doi: 10.1007/BF03195249.

Author

Borys Wróbel¹

Affiliation

¹ Department of Marine Genetics and Biotechnology, Institute of Oceanology, Polish Academy of Sciences, Powstańców Warszawy 55, 81-712 Sopot, Poland. bwrobel@iopan.gda.pl

PMID: 18263970
DOI: 10.1007/BF03195249

Abstract

In recent years, the emphasis of theoretical work on phylogenetic inference has shifted from developing new tree inference methods to developing methods that measure the statistical support for topologies. This paper reviews 3 approaches to assigning support values to branches in trees obtained from the analysis of molecular sequences: the bootstrap, Bayesian posterior probabilities for clades, and interior branch tests. In some circumstances, these methods give different answers. This should not be surprising, because their assumptions differ. Interior branch tests assume that a given topology is true and consider only whether a particular branch length is greater than zero. If the tree is incorrect, a wrong branch may still have a non-zero length; low bootstrap or Bayesian support may indicate such a branch. If the substitution model is oversimplified, the length of a branch may be overestimated, and the Bayesian support for the branch may be inflated. The bootstrap, on the other hand, approximates the variance of the data under the real model of sequence evolution, because it resamples directly from the data. Thus a discrepancy between the Bayesian support and the bootstrap support may signal model inaccuracy. In practical applications, the use of all 3 methods is recommended, and any discrepancies observed should prompt a careful analysis of their potential origins.
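
The resampling step behind bootstrap support can be illustrated with a minimal Python sketch. It is not the procedure of any particular program. Alignment columns are drawn with replacement, a tree is rebuilt from each pseudo-replicate, and the support for a clade is the fraction of replicates in which it reappears. To keep the example self-contained, tree building here is UPGMA on uncorrected p-distances, which is an assumption made for brevity and a stand-in for the model-based inference that the review discusses. The function names (`p_distance`, `upgma_clades`, `bootstrap_support`) and the toy alignment are hypothetical.

```python
import random
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma_clades(alignment):
    """Return the non-trivial clades (frozensets of taxon names) of a UPGMA tree.

    UPGMA is only a simple placeholder for a real tree inference method.
    """
    names = list(alignment)
    pair = {frozenset((a, b)): p_distance(alignment[a], alignment[b])
            for a, b in combinations(names, 2)}

    def dist(c1, c2):
        # Average distance over all pairs of members (average linkage).
        total = sum(pair[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([n]) for n in names]
    clades = set()
    # Stop at two clusters: the final merge would only give the full taxon set.
    while len(clusters) > 2:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        clades.add(merged)
    return clades

def bootstrap_support(alignment, replicates=200, seed=0):
    """Fraction of bootstrap replicates recovering each clade of the original tree."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    reference = upgma_clades(alignment)
    counts = {clade: 0 for clade in reference}
    for _ in range(replicates):
        # Resample alignment columns (sites) with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {name: "".join(seq[i] for i in cols)
                     for name, seq in alignment.items()}
        found = upgma_clades(resampled)
        for clade in reference:
            if clade in found:
                counts[clade] += 1
    return {clade: counts[clade] / replicates for clade in reference}

# Toy alignment, invented for illustration only.
toy = {
    "A": "ACGTACGTACGTACGTACGA",
    "B": "ACGTACGTACGAACGTACGA",
    "C": "TCGAACTTACGTTCGTACCA",
    "D": "TCGAACTTTCGTTCGAACCA",
}
for clade, support in bootstrap_support(toy).items():
    print(sorted(clade), support)
```

In a real analysis, the tree-building step would be replaced by the model-based method under study, and the resulting bootstrap proportions could then be compared with Bayesian posterior probabilities for the same clades.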

Publication types

Research Support, Non-U.S. Gov't
Review

MeSH terms

Animals
Bayes Theorem
Computers, Molecular / statistics & numerical data
Computers, Molecular / trends
Humans
Models, Genetic*
Phylogeny*
Sequence Analysis, DNA / methods*
Sequence Analysis, DNA / statistics & numerical data*
Sequence Analysis, DNA / trends
Sequence Analysis, Protein / methods*
Sequence Analysis, Protein / statistics & numerical data*
Sequence Analysis, Protein / trends
Uncertainty*