Next-generation sequencing reveals new information about HLA allele and haplotype diversity in a large European American population

Hum Immunol. 2019 Oct;80(10):807-822. doi: 10.1016/j.humimm.2019.07.275. Epub 2019 Jul 22.

Abstract

The human leukocyte antigen (HLA) genes are extremely polymorphic and are useful molecular markers for making inferences about human population history. However, the accuracy of genetic diversity estimates at HLA loci depends heavily on the technology used to characterize HLA alleles; compared with lower-resolution typing methods, high-resolution genotyping of long-range HLA gene products improves the assessment of HLA population diversity and of other population parameters. In this study we examined HLA allelic and haplotype diversity in a large healthy European American population sourced from the UCSF-DNA bank. A high-resolution next-generation sequencing method was applied to define non-ambiguous 3- and 4-field alleles at the HLA-A, HLA-C, HLA-B, HLA-DRB1, HLA-DRB3/4/5, HLA-DQA1, HLA-DQB1, HLA-DPA1, and HLA-DPB1 loci in samples provided by 2248 unrelated individuals. A number of population parameters, including balancing selection, were examined, and various measures of linkage disequilibrium were calculated. There were no detectable deviations from Hardy-Weinberg proportions at HLA-A, HLA-DRB1, HLA-DQA1 and HLA-DQB1. At the remaining loci (HLA-C, HLA-B, HLA-DRB3/4/5, HLA-DPA1 and HLA-DPB1), moderate and significant deviations were detected, attributable mostly to population substructure. Unique 4-field associations were observed among alleles at two loci and among haplotypes extending over large intervals; these associations were not apparent in results obtained using testing methodologies with limited sequence coverage and phasing. The high diversity at HLA-DPA1 results from the detection of intron variants of otherwise well-conserved protein sequences. We speculate that divergence in exon sequences may be negatively selected. Our data provide a valuable reference source for future population studies that may allow precise fine mapping of coding and non-coding sequences determining disease susceptibility and allo-immunogenicity.
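To illustrate two of the population parameters named in the abstract, the following is a minimal Python sketch of a Hardy-Weinberg goodness-of-fit statistic for a single locus and of the pairwise linkage disequilibrium coefficients D and D' for two loci. It is not the authors' analysis pipeline: the function names (`allele_frequencies`, `hardy_weinberg_chi2`, `linkage_disequilibrium`) and the allele labels are hypothetical, the data are toy values, and a plain chi-square statistic is assumed here only for simplicity. Highly polymorphic loci such as HLA typically call for exact or resampling-based tests instead.

```python
from collections import Counter
from itertools import combinations_with_replacement


def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of (allele1, allele2) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * len(genotypes)  # each individual carries two alleles
    return {allele: count / total for allele, count in counts.items()}


def hardy_weinberg_chi2(genotypes):
    """Chi-square statistic of observed genotype counts vs. Hardy-Weinberg expectations."""
    n = len(genotypes)
    freqs = allele_frequencies(genotypes)
    # Sort each genotype so that (a, b) and (b, a) are counted together.
    observed = Counter(tuple(sorted(genotype)) for genotype in genotypes)
    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        if a == b:
            expected = n * freqs[a] ** 2           # homozygote: n * p^2
        else:
            expected = n * 2 * freqs[a] * freqs[b]  # heterozygote: n * 2pq
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2


def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, D') for one allele pair, given phased (locus1, locus2) haplotypes."""
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h == (allele_a, allele_b)) / n
    d = p_ab - p_a * p_b
    # D' normalizes D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Toy data with hypothetical allele labels (not taken from the study).
toy_genotypes = [("a1", "a1"), ("a1", "a2"), ("a2", "a1"), ("a2", "a2")]
toy_haplotypes = [("a1", "b1"), ("a1", "b1"), ("a2", "b2"), ("a2", "b2")]

hw_stat = hardy_weinberg_chi2(toy_genotypes)
d, d_prime = linkage_disequilibrium(toy_haplotypes, "a1", "b1")
```

In this toy input the genotype counts match Hardy-Weinberg proportions exactly, so the statistic is zero, and the two loci are in complete association, so D' equals 1. With real data the statistic would be compared against a reference distribution, and D' would be computed for every allele pair of interest.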

Keywords: Allele frequency; European Americans; Haplotype blocks; Haplotype frequency; Human leukocyte antigen; Linkage disequilibrium; Next-generation sequencing; Population genetics.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Alleles
  • Cohort Studies
  • Europe / ethnology
  • Female
  • Gene Frequency / genetics*
  • Genetic Loci / genetics
  • Genetics, Population / methods*
  • HLA Antigens / genetics*
  • Haplotypes / genetics*
  • High-Throughput Nucleotide Sequencing*
  • Histocompatibility Testing
  • Humans
  • Linkage Disequilibrium / genetics
  • Male
  • Middle Aged
  • United States
  • White People / ethnology
  • White People / genetics*
  • Young Adult

Substances

  • HLA Antigens