Mid-pass whole genome sequencing enables biomedical genetic studies of diverse populations

BMC Genomics. 2021 Nov 1;22(1):666. doi: 10.1186/s12864-021-07949-9.

Abstract

Background: Historically, geneticists have relied on genotyping arrays and imputation to study human genetic variation. However, the underrepresentation of diverse populations has resulted in arrays that poorly capture global genetic variation and in a lack of reference panels. This has contributed to deepening global health disparities. Whole genome sequencing (WGS) better captures genetic variation but remains prohibitively expensive. Thus, we explored WGS at "mid-pass" coverage (1-7x).

Results: Here, we developed and benchmarked methods for mid-pass sequencing. When applied to a population without an existing genomic reference panel, 4x mid-pass performed consistently well across ethnicities, with high recall (98%) and precision (97.5%).
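The abstract does not state how recall and precision were computed. As a minimal sketch, assuming the standard definitions applied to variant calls compared against a truth set (for example, calls from high-coverage sequencing of the same sample), the two metrics could be calculated as follows. The `Variant` type, the `precision_recall` function, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Iterable, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key: site plus alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def precision_recall(called: Iterable[Variant],
                     truth: Iterable[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) of a call set against a truth set.

    Assumes exact matching on (chrom, pos, ref, alt); real benchmarks
    usually also normalize variant representation and compare genotypes.
    """
    called_set, truth_set = set(called), set(truth)
    tp = len(called_set & truth_set)   # called and in truth
    fp = len(called_set - truth_set)   # called but not in truth
    fn = len(truth_set - called_set)   # in truth but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (illustrative only): 4 truth variants, 4 calls, 3 shared.
truth_calls = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "C", "T"),
    Variant("chr2", 75, "G", "A"),
    Variant("chr2", 900, "T", "C"),
]
mid_pass_calls = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "C", "T"),
    Variant("chr2", 75, "G", "A"),
    Variant("chr3", 10, "A", "T"),   # false positive
]

precision, recall = precision_recall(mid_pass_calls, truth_calls)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these definitions, recall measures the fraction of true variants that mid-pass sequencing recovers, and precision measures the fraction of reported variants that are real.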

Conclusion: Compared to array data imputed using the 1000 Genomes reference panel, mid-pass performed better across all metrics and identified novel population-specific variants with potential disease relevance. We hope our work will lower the financial barriers that geneticists from underrepresented populations face in characterizing their genomes prior to biomedical genetic applications.

MeSH terms

  • Genome
  • Genome, Human
  • Genome-Wide Association Study*
  • Genomics
  • Genotype
  • Humans
  • Polymorphism, Single Nucleotide*
  • Whole Genome Sequencing