Efficient Bayesian inference under the multispecies coalescent with migration

Proc Natl Acad Sci U S A. 2023 Oct 31;120(44):e2310708120. doi: 10.1073/pnas.2310708120. Epub 2023 Oct 23.

Abstract

Analyses of genome sequence data have revealed pervasive interspecific gene flow and enriched our understanding of the role of gene flow in speciation and adaptation. Inference of gene flow using genomic data requires powerful statistical methods. Yet current likelihood-based methods involve heavy computation and are feasible for small datasets only. Here, we implement the multispecies-coalescent-with-migration model in the Bayesian program bpp, which can be used to test for gene flow and estimate migration rates, as well as species divergence times and population sizes. We develop Markov chain Monte Carlo algorithms for efficient sampling from the posterior, enabling the analysis of genome-scale datasets with thousands of loci. Implementation of both introgression and migration models in the same program allows us to test whether gene flow occurred continuously over time or in pulses. Analyses of genomic data from Anopheles mosquitoes demonstrate rich information in typical genomic datasets about the mode and rate of gene flow.
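As a rough illustration of what sampling a rate parameter from a posterior by Markov chain Monte Carlo involves, the following minimal sketch runs a Metropolis-Hastings chain on a toy problem. It is not the algorithm implemented in bpp and does not evaluate a multispecies-coalescent likelihood. It assumes, purely for illustration, that the data are exponentially distributed waiting times governed by a single rate, with a gamma prior on that rate. The names `log_posterior`, `metropolis_hastings`, `waits`, `alpha`, `beta`, and `step` are hypothetical and do not come from the paper or the program.

```python
import math
import random


def log_posterior(rate, waits, alpha, beta):
    """Unnormalised log posterior of a rate (toy model, an assumption).

    Prior: Gamma(alpha, beta) on the rate.
    Likelihood: independent exponential waiting times with that rate.
    """
    if rate <= 0:
        return -math.inf
    log_prior = (alpha - 1.0) * math.log(rate) - beta * rate
    log_lik = len(waits) * math.log(rate) - rate * sum(waits)
    return log_prior + log_lik


def metropolis_hastings(waits, alpha=2.0, beta=1.0, n_iter=20000,
                        burn_in=2000, step=1.0, seed=1):
    """Sample the rate with a multiplicative (log-scale) sliding window."""
    rng = random.Random(seed)
    rate = 1.0
    current = log_posterior(rate, waits, alpha, beta)
    samples = []
    for i in range(n_iter):
        # Propose rate * exp(c), with c uniform on (-step/2, step/2).
        log_scale = step * (rng.random() - 0.5)
        proposal = rate * math.exp(log_scale)
        candidate = log_posterior(proposal, waits, alpha, beta)
        # The proposal is symmetric on the log scale, so the Hastings
        # ratio on the original scale is proposal / rate = exp(log_scale).
        log_accept = candidate - current + log_scale
        if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
            rate, current = proposal, candidate
        if i >= burn_in:
            samples.append(rate)
    return samples


if __name__ == "__main__":
    # Made-up waiting times, used only to exercise the sampler.
    waits = [0.5, 1.2, 0.3, 0.8, 2.0, 0.7, 1.1, 0.4, 0.9, 1.6]
    draws = metropolis_hastings(waits)
    print("posterior mean of rate:", sum(draws) / len(draws))
```

Because the gamma prior is conjugate to the exponential likelihood, the exact posterior in this toy case is Gamma(alpha + n, beta + sum of waits), so the sample mean can be checked against (alpha + n) / (beta + sum of waits). Genome-scale inference under the model described in the abstract must also sample gene trees and many more parameters, which this sketch does not attempt.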

Keywords: BPP; gene flow; genomics; migration; multispecies coalescent.

MeSH terms

  • Algorithms*
  • Animals
  • Bayes Theorem
  • Computer Simulation
  • Gene Flow*
  • Likelihood Functions
  • Models, Genetic
  • Phylogeny