Elevated substitution rate estimates from ancient DNA: model violation and bias of Bayesian methods

Mol Ecol. 2009 Nov;18(21):4390-7. doi: 10.1111/j.1365-294X.2009.04333.x. Epub 2009 Sep 7.

Abstract

The increasing ability to extract and sequence DNA from noncontemporaneous tissue offers biologists the opportunity to analyse ancient DNA (aDNA) together with modern DNA (mDNA) to address the taxonomy of extinct species, evolutionary origins, historical phylogeography and biogeography. Perhaps more exciting are recent developments in coalescence-based Bayesian inference that offer the potential to use temporal information from aDNA and mDNA for the estimation of substitution rates and divergence dates as an alternative to fossil and geological calibration. This comes at a time of growing interest in the possibility of time dependency for molecular rate estimates. In this study, we provide a critical assessment of Bayesian Markov chain Monte Carlo (MCMC) analysis for the estimation of substitution rate using simulated samples of aDNA and mDNA. We conclude that the current models and priors employed in Bayesian MCMC analysis of heterochronous mtDNA are susceptible to an upward bias in the estimation of substitution rates because of model misspecification when the data come from populations with less than simple demographic histories, including sudden short-lived population bottlenecks or pronounced population structure. However, when model misspecification is only mild, then the 95% highest posterior density intervals provide adequate frequentist coverage of the true rates.
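The closing claim about frequentist coverage of 95% highest posterior density (HPD) intervals can be made concrete with a toy calculation. The sketch below is not the simulation design used in the study. It assumes a hypothetical Gaussian posterior in place of a coalescent-based MCMC analysis, and the names `hpd_interval`, `coverage`, `simulate_coverage` and the parameter `bias_sd` are illustrative only. It shows how an HPD interval is read off a set of posterior draws, and how coverage of the true rate can be tallied across replicate data sets, with an optional upward shift standing in for the effect of model misspecification.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples.

    Assumes a unimodal posterior, so the HPD region is a single interval.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    k = min(n, max(1, math.ceil(mass * n)))  # samples the interval must hold
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


def coverage(intervals, true_value):
    """Fraction of intervals that contain the true value."""
    hits = sum(lo <= true_value <= hi for lo, hi in intervals)
    return hits / len(intervals)


def simulate_coverage(true_rate=1e-7, sd=2e-8, bias_sd=0.0,
                      n_reps=200, n_draws=2000, seed=1):
    """Toy replicate experiment (hypothetical, Gaussian stand-in).

    Each replicate has a posterior centred on a noisy estimate of the rate.
    `bias_sd` shifts that estimate upward, in units of `sd`, to mimic an
    upward bias caused by a misspecified demographic model.
    """
    rng = random.Random(seed)
    intervals = []
    for _ in range(n_reps):
        centre = rng.gauss(true_rate + bias_sd * sd, sd)
        draws = [rng.gauss(centre, sd) for _ in range(n_draws)]
        intervals.append(hpd_interval(draws))
    return coverage(intervals, true_rate)


if __name__ == "__main__":
    print("coverage, no bias:     ", simulate_coverage(bias_sd=0.0))
    print("coverage, upward bias: ", simulate_coverage(bias_sd=3.0))
```

In this toy setting, coverage stays close to the nominal level when the estimate is unbiased and falls well below it once the upward shift is large relative to the posterior spread, which is the qualitative pattern the abstract describes for mild versus severe misspecification.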

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Evolution, Molecular*
  • Gene Flow
  • Genetics, Population*
  • Markov Chains
  • Models, Genetic*
  • Monte Carlo Method
  • Phylogeny*
  • Sequence Analysis, DNA