Estimating linkage disequilibrium and selection from allele frequency trajectories

Genetics. 2023 Mar 2;223(3):iyac189. doi: 10.1093/genetics/iyac189.

Abstract

Genetic sequences collected over time provide an exciting opportunity to study natural selection. In such studies, it is important to account for linkage disequilibrium, both to measure selection accurately and to distinguish selection from other effects that can change allele frequencies, such as genetic hitchhiking or clonal interference. However, most high-throughput sequencing methods cannot directly measure linkage because of their short read lengths. Here we develop a simple method to estimate linkage disequilibrium from time-series allele frequencies. This reconstructed linkage information can then be combined with other inference methods to infer the fitness effects of individual mutations. Simulations show that our approach reliably outperforms inference that ignores linkage disequilibrium and, with sufficient sampling, performs similarly to inference using the true linkage information. We also introduce two regularization methods derived from random matrix theory that help preserve the method's performance when sampling is limited. Overall, our method enables the use of linkage-aware inference methods even for data sets where only allele frequency time series are available.
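The abstract does not state the estimator itself, so the following Python sketch is only an illustration of the general idea of recovering pairwise covariance between loci from frequency trajectories alone. It assumes, without confirmation from the abstract, that the covariance of per-step allele frequency changes can serve as a proxy for linkage. The function names `frequency_change_covariance` and `shrink_toward_diagonal` and the parameter `gamma` are hypothetical, and the linear shrinkage shown is a generic stand-in, not the random-matrix-theory regularizers the abstract mentions.

```python
import numpy as np


def frequency_change_covariance(freqs):
    """Sample covariance of per-step allele frequency changes.

    freqs: array-like of shape (T, L) holding allele frequencies at
    T time points for L loci. Returns an L x L matrix.
    ASSUMPTION: treating this matrix as a proxy for linkage is an
    illustrative choice, not the estimator defined in the paper.
    """
    freqs = np.asarray(freqs, dtype=float)
    if freqs.ndim != 2 or freqs.shape[0] < 3:
        raise ValueError("need a (T, L) array with at least 3 time points")
    deltas = np.diff(freqs, axis=0)            # shape (T - 1, L)
    centered = deltas - deltas.mean(axis=0)    # remove the mean change per locus
    return centered.T @ centered / (deltas.shape[0] - 1)


def shrink_toward_diagonal(cov, gamma):
    """Generic linear shrinkage of a covariance matrix.

    Off-diagonal entries are scaled by (1 - gamma); the diagonal is kept.
    ASSUMPTION: a simple stand-in for regularization under limited
    sampling, not the paper's random-matrix-theory methods.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    cov = np.asarray(cov, dtype=float)
    diagonal = np.diag(np.diag(cov))
    return (1.0 - gamma) * cov + gamma * diagonal
```

With few time points relative to the number of loci, the raw covariance estimate is noisy, which is the situation some form of regularization is meant to address. Setting `gamma = 0` leaves the estimate unchanged, while `gamma = 1` discards all cross-locus covariance and so corresponds to ignoring linkage entirely.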

Keywords: allele frequency time series; covariance estimation; genetic linkage; selection coefficients; short-read data; statistical inference.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Gene Frequency
  • High-Throughput Nucleotide Sequencing*
  • Linkage Disequilibrium
  • Mutation
  • Selection, Genetic*