Stationary distribution of the linkage disequilibrium coefficient r²

Theor Popul Biol. 2019 Aug:128:19-26. doi: 10.1016/j.tpb.2019.05.002. Epub 2019 May 27.

Abstract

The linkage disequilibrium coefficient r² is a measure of statistical dependence of the alleles possessed by an individual at different genetic loci. It is widely used in association studies to search for the locations of disease-causing genes on chromosomes. Most studies to date treat r² as a fixed property of two loci in a finite population, and investigate the sampling distribution of estimators due to the statistical sampling of individuals from the population. Here, we instead consider the distribution of r² itself under a process of genetic sampling through the generations. Using a classical two-locus model for genetic drift, mutation, and recombination, we investigate the probability density function of r² at stationarity. This density function provides a tool for inference on evolutionary parameters such as mutation and recombination rates. We reconstruct the approximate stationary density of r² by calculating a finite sequence of the distribution's moments and applying the maximum entropy principle. Our approach is based on the diffusion approximation, under which we demonstrate that for certain models in population genetics, moments of the stationary distribution can be obtained without knowing the probability distribution itself. To illustrate our approach, we show how the stationary probability density of r² can be used in a maximum likelihood framework to estimate mutation and recombination rates from sample data of r².
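For readers unfamiliar with the quantity itself, the following minimal sketch computes r² for two biallelic loci from the four haplotype frequencies. The abstract does not state the formula, so this assumes the standard textbook definition: D = f_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The function name r_squared and its argument names are hypothetical, chosen for illustration only; this is not the paper's code and does not implement its diffusion or maximum entropy calculations.

```python
import math


def r_squared(f_AB, f_Ab, f_aB, f_ab):
    """Return r^2 for two biallelic loci given haplotype frequencies.

    Assumes the standard definition (not taken from the abstract):
    D = f_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)).
    """
    total = f_AB + f_Ab + f_aB + f_ab
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("haplotype frequencies must sum to 1")

    # Marginal frequencies of allele A (first locus) and allele B (second locus).
    p_A = f_AB + f_Ab
    p_B = f_AB + f_aB

    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0.0:
        # r^2 is undefined when either locus carries only one allele.
        raise ValueError("r^2 is undefined when a locus is monomorphic")

    # Linkage disequilibrium coefficient D: excess of AB over independence.
    D = f_AB - p_A * p_B
    return D * D / denom
```

Under this definition, r² is 0 when alleles at the two loci are statistically independent and 1 when only two complementary haplotypes are present, for example r_squared(0.5, 0.0, 0.0, 0.5).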

Keywords: Diffusion approximation; Linkage disequilibrium; Maximum entropy principle; Maximum likelihood; Stationary distribution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alleles
  • Genetic Loci
  • Genetics, Population
  • Linkage Disequilibrium*
  • Models, Statistical*