Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits

Nat Genet. 2023 May;55(5):768-776. doi: 10.1038/s41588-023-01379-x. Epub 2023 May 1.

Abstract

Genome-wide genealogies compactly represent the evolutionary history of a set of genomes, and inferring them from genetic data has the potential to facilitate a wide range of analyses. We introduce a method, ARG-Needle, for accurately inferring biobank-scale genealogies from sequencing or genotyping array data, as well as strategies to utilize genealogies to perform association and other complex trait analyses. We use these methods to build genome-wide genealogies using genotyping data for 337,464 UK Biobank individuals and test for association across seven complex traits. Genealogy-based association detects more rare and ultra-rare signals (N = 134, frequency range 0.0007% to 0.1%) than genotype imputation using ~65,000 sequenced haplotypes (N = 64). In a subset of 138,039 exome sequencing samples, these associations strongly tag (average r = 0.72) underlying sequencing variants enriched (4.8×) for loss-of-function variation. These results demonstrate that inferred genome-wide genealogies may be leveraged in the analysis of complex traits, complementing approaches that require the availability of large, population-specific sequencing panels.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Specimen Banks
  • Genetics, Population*
  • Genome-Wide Association Study
  • Genotype
  • Humans
  • Multifactorial Inheritance* / genetics
  • Polymorphism, Single Nucleotide / genetics
  • Recombination, Genetic