On the decidability of population size histories from finite allele frequency spectra

Theor Popul Biol. 2018 Mar:120:42-51. doi: 10.1016/j.tpb.2017.12.008. Epub 2018 Jan 3.

Abstract

Understanding the historical events that shaped current genomic diversity has applications in historical, biological, and medical research. However, the amount of historical information that can be inferred from genetic data is finite, which leads to an identifiability problem. For example, different historical processes can lead to identical distributions of allele frequencies. This identifiability issue casts a shadow of uncertainty over the results of any study that uses the frequency spectrum to infer past demography. It has been argued that imposing mild 'reasonableness' constraints on demographic histories can enable unique reconstruction, at least in an idealized setting where the length of the genome is nearly infinite. Here, we discuss this problem for finite sample size and genome length. Using the diffusion approximation, we obtain bounds on likelihood differences between similar demographic histories, and use them to construct pairs of very different reasonable histories that produce almost-identical frequency distributions. The finite-genome problem therefore remains poorly determined even among reasonable histories. Where fits to few-parameter models produce narrow parameter confidence intervals, large uncertainties lurk, hidden by model assumptions.
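
The role of genome length can be illustrated with a minimal sketch. It is not the paper's diffusion-based bound. It assumes independent (unlinked) segregating sites sampled multinomially from a normalized frequency spectrum, in which case the expected log-likelihood ratio between two candidate spectra equals the number of sites times their Kullback-Leibler divergence. The constant-size spectrum proportional to 1/i is standard; the perturbed spectrum `q`, the perturbation size `eps`, and all function names are hypothetical and chosen only for illustration, not derived from any demographic model.

```python
import math


def neutral_sfs(n):
    """Normalized expected spectrum for a constant-size population.

    Entry i (for i = 1..n-1) is proportional to 1/i.
    """
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_log_lik_ratio(p, q, num_snps):
    """Expected log-likelihood ratio favoring p over q when the data are
    num_snps independent sites drawn from p (multinomial assumption)."""
    return num_snps * kl_divergence(p, q)


n = 20      # sample size (number of sampled chromosomes), illustrative
eps = 0.01  # hypothetical relative perturbation of the spectrum

p = neutral_sfs(n)
# Hypothetical "nearby" spectrum: a smooth 1% distortion of p, renormalized.
raw = [pi * (1 + eps * math.cos(math.pi * i / n)) for i, pi in enumerate(p, start=1)]
raw_total = sum(raw)
q = [r / raw_total for r in raw]

for num_snps in (1_000, 100_000, 10_000_000):
    print(num_snps, expected_log_lik_ratio(p, q, num_snps))
```

Under these assumptions the expected log-likelihood ratio grows linearly with the number of sites, so two spectra that differ only slightly can remain statistically indistinguishable at realistic genome lengths even though they would be separable in the infinite-genome limit.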

Keywords: Demographic inference; Diffusion; Frequency spectrum; Wright–Fisher.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Demography / methods*
  • Evolution, Molecular
  • Gene Frequency*
  • Genetic Variation
  • Genetics, Population*
  • Humans
  • Likelihood Functions
  • Models, Genetic*
  • Population Density*

Grants and funding