Evaluating the Adequacy of Molecular Clock Models Using Posterior Predictive Simulations

Mol Biol Evol. 2015 Nov;32(11):2986-95. doi: 10.1093/molbev/msv154. Epub 2015 Jul 10.

Abstract

Molecular clock models are commonly used to estimate evolutionary rates and timescales from nucleotide sequences. These models aim to account for rate variation among lineages, and they are assumed to be adequate descriptions of the processes that generated the data. A common approach for selecting a clock model for a data set of interest is to examine a set of candidates and to select the model that provides the best statistical fit. However, this can lead to unreliable estimates if all the candidate models are actually inadequate. For this reason, a method of evaluating absolute model performance is critical. We describe a method that uses posterior predictive simulations to assess the adequacy of clock models. We test the power of this approach using simulated data and find that the method is sensitive to bias in the estimates of branch lengths, which tends to occur when using underparameterized clock models. We also evaluate the multinomial test statistic, originally developed to assess the adequacy of substitution models, but find that it has low power for assessing the adequacy of clock models. We illustrate the performance of our method using empirical data sets from coronaviruses, simian immunodeficiency virus, killer whales, and marine turtles. Our results indicate that methods of investigating model adequacy, including the one proposed here, should be routinely used in combination with traditional model selection in evolutionary studies. This will reveal whether a broader range of clock models needs to be considered in phylogenetic analysis.
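The general idea of a posterior predictive check on branch lengths can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`quantile`, `simulate_branch_length`, `branch_adequacy`), the 95% interval, and the per-site Bernoulli simulator are all assumptions made for illustration. A real analysis would simulate alignments along the tree from posterior samples of rates and times and re-estimate branch lengths with a phylogenetic package.

```python
import math
import random


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def simulate_branch_length(rate, time, n_sites, rng):
    """Toy stand-in (assumption): each site changes independently with
    probability 1 - exp(-rate * time); returns the observed proportion
    of changed sites, ignoring multiple hits."""
    p = 1.0 - math.exp(-rate * time)
    changed = sum(rng.random() < p for _ in range(n_sites))
    return changed / n_sites


def branch_adequacy(empirical, predictive, level=0.95):
    """Compare empirical branch lengths with predictive replicates.

    empirical:  list of branch lengths estimated from the real data.
    predictive: list of replicates, each a list of branch lengths
                (same branch order) estimated from simulated data.
    Returns (inside, proportion): a per-branch flag saying whether the
    empirical length lies in the central `level` interval of the
    predictive draws, and the proportion of branches that do.
    """
    tail = (1.0 - level) / 2.0
    inside = []
    for b, observed in enumerate(empirical):
        draws = sorted(rep[b] for rep in predictive)
        lower = quantile(draws, tail)
        upper = quantile(draws, 1.0 - tail)
        inside.append(lower <= observed <= upper)
    return inside, sum(inside) / len(inside)
```

Under these assumptions, a low proportion of branches inside their predictive intervals would suggest that the clock model fails to reproduce the branch lengths seen in the data, whichever candidate model fits best in relative terms.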

Keywords: Bayesian phylogenetics; evolutionary rates; model adequacy; model selection; molecular clock; posterior predictive simulations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bayes Theorem
  • Biological Evolution
  • Computer Simulation
  • Coronavirus / genetics
  • Evolution, Molecular
  • Genome, Mitochondrial
  • Models, Genetic*
  • Models, Molecular
  • Mutation Rate*
  • Phylogeny
  • Reproducibility of Results
  • Simian Immunodeficiency Virus / genetics
  • Whale, Killer / genetics