Bayesian Analysis of Evolutionary Divergence with Genomic Data under Diverse Demographic Models

Mol Biol Evol. 2017 Jun 1;34(6):1517-1528. doi: 10.1093/molbev/msx070.

Abstract

We present a new Bayesian method for estimating demographic and phylogenetic history using population genomic data. Several key innovations are introduced that allow the study of diverse models within an Isolation-with-Migration framework. The new method implements a 2-step analysis, with an initial Markov chain Monte Carlo (MCMC) phase that samples simple coalescent trees, followed by the calculation of the joint posterior density for the parameters of a demographic model. In step 1, the MCMC sampling phase, the method uses a reduced state space, consisting of coalescent trees without migration paths, and a simple importance sampling distribution without the demography of interest. Once obtained, a single sample of trees can be used in step 2 to calculate the joint posterior density for model parameters under multiple diverse demographic models, without having to repeat MCMC runs. Because migration paths are not included in the state space of the MCMC phase, but rather are handled by analytic integration in step 2 of the analysis, the method is scalable to a large number of loci with excellent MCMC mixing properties. With an implementation of the new method in the computer program MIST, we demonstrate the method's accuracy, scalability, and other advantages using simulated data and DNA sequences of two common chimpanzee subspecies: Pan troglodytes (P. t.) troglodytes and P. t. verus.
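The reuse of a single sample across several models, as described above, rests on importance sampling: draws taken once from a simple distribution are reweighted by the ratio of a target density to the sampling density. The Python sketch below is a toy illustration of that general idea only and is not the MIST implementation. It assumes a one-dimensional stand-in for a coalescent tree (a single waiting time), an exponential sampling distribution, and exponential target "models" indexed by a rate. The names `log_exp_density`, `reweighted_mean`, and `PROPOSAL_RATE` are hypothetical.

```python
import math
import random


def log_exp_density(x, rate):
    """Log density of an exponential distribution with the given rate."""
    return math.log(rate) - rate * x


def reweighted_mean(samples, proposal_rate, target_rate):
    """Self-normalized importance sampling estimate of E[x] under the target.

    The samples were drawn from the proposal, so each one is weighted by
    target_density / proposal_density before averaging.
    """
    log_w = [
        log_exp_density(x, target_rate) - log_exp_density(x, proposal_rate)
        for x in samples
    ]
    # Subtract the maximum log weight before exponentiating for stability.
    max_log_w = max(log_w)
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, samples)) / total


# "Step 1" analogue: draw one sample from a simple distribution (assumed toy).
rng = random.Random(0)
PROPOSAL_RATE = 1.0
samples = [rng.expovariate(PROPOSAL_RATE) for _ in range(50_000)]

# "Step 2" analogue: reuse the same draws under several target models
# without sampling again.
estimates = {
    rate: reweighted_mean(samples, PROPOSAL_RATE, rate)
    for rate in (1.5, 2.0, 3.0)
}

for rate, estimate in estimates.items():
    print(f"rate={rate}: estimated mean={estimate:.4f}, exact mean={1 / rate:.4f}")
```

Each estimate should fall close to the exact mean 1/rate, even though the draws were never regenerated for the individual targets. The target rates here were chosen larger than the sampling rate so that the weights stay bounded. In general, importance sampling works well only when the sampling distribution covers the regions where the target places its mass.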

Keywords: Markov chain representation; importance sampling; isolation-with-migration model; likelihood ratio test; model comparison.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem*
  • Biological Evolution
  • Demography
  • Evolution, Molecular
  • Genetic Variation / genetics
  • Genomics / methods*
  • Markov Chains
  • Models, Genetic
  • Monte Carlo Method
  • Phylogeny
  • Software