Phylogenetic inference using whole genomes

Annu Rev Genomics Hum Genet. 2008:9:217-31. doi: 10.1146/annurev.genom.9.081307.164407.

Abstract

The availability of genome-wide data provides unprecedented opportunities for resolving difficult phylogenetic relationships and for studying population genetic processes of mutation, selection, and recombination on a genomic scale. The use of appropriate statistical models becomes increasingly important when we are faced with very large datasets, which can lead to improved precision but not necessarily improved accuracy if the analytical methods have systematic biases. This review provides a critical examination of methods for analyzing genomic datasets from multiple loci, including concatenation, separate gene-by-gene analyses, and statistical models that accommodate heterogeneity in different aspects of the evolutionary process among data partitions. We discuss factors that may cause the gene tree to differ from the species tree, as well as strategies for estimating species phylogenies in the presence of gene tree conflicts. Genomic datasets provide computational and statistical challenges that are likely to be a focus of research for years to come.
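The distinction between precision and accuracy drawn above can be illustrated with a small simulation. This sketch is not taken from the review: it assumes two sequences evolving under the Jukes-Cantor (JC69) model and compares the uncorrected proportion of differing sites (a systematically biased distance estimate) with the JC69-corrected distance. The function names, the true distance of 0.5, and the site counts are hypothetical choices made for illustration only.

```python
import math
import random
import statistics


def expected_p(d):
    """Expected proportion of differing sites under JC69 for true distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_distance(p):
    """JC69-corrected distance from an observed proportion of differences p."""
    if p >= 0.75:
        return math.inf  # correction is undefined at or beyond saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def simulate(true_d, n_sites, n_reps, rng):
    """Return (uncorrected, corrected) distance estimates over n_reps datasets.

    Assumption: sites are independent, so each site differs with
    probability expected_p(true_d).
    """
    p_true = expected_p(true_d)
    raw, corrected = [], []
    for _ in range(n_reps):
        diffs = sum(rng.random() < p_true for _ in range(n_sites))
        p_hat = diffs / n_sites
        raw.append(p_hat)                       # biased: ignores multiple hits
        corrected.append(jc69_distance(p_hat))  # consistent under JC69
    return raw, corrected


if __name__ == "__main__":
    rng = random.Random(1)
    true_d = 0.5  # hypothetical true distance (substitutions per site)
    for n_sites in (500, 20000):
        raw, corr = simulate(true_d, n_sites, 100, rng)
        print(
            f"sites={n_sites:6d}  "
            f"uncorrected mean={statistics.mean(raw):.3f} "
            f"sd={statistics.stdev(raw):.4f}  "
            f"corrected mean={statistics.mean(corr):.3f} "
            f"sd={statistics.stdev(corr):.4f}"
        )
```

Under these assumptions, adding sites shrinks the spread of both estimators, but the uncorrected estimate settles on a value well below the true distance, whereas the corrected estimate approaches it. More data tightens a biased answer without making it right.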

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Data Interpretation, Statistical
  • Databases, Genetic / statistics & numerical data
  • Gene Transfer, Horizontal
  • Genome
  • Genomics / statistics & numerical data*
  • Humans
  • Phylogeny*
  • Polymorphism, Genetic
  • Recombination, Genetic
  • Sequence Alignment / statistics & numerical data
  • Software