Population structure with localized haplotype clusters

Genetics. 2010 Aug;185(4):1337-44. doi: 10.1534/genetics.110.116681. Epub 2010 May 10.

Abstract

We propose a multilocus version of F(ST) and a measure of haplotype diversity, both based on localized haplotype clusters. Specifically, we use haplotype clusters identified with BEAGLE, a program that implements a hidden Markov model for localized haplotype clustering and performs several functions, including inference of haplotype phase. We apply this methodology to HapMap phase 3 data. With this haplotype-cluster approach, African populations have the highest diversity and the lowest divergence from the ancestral population, East Asian populations have the lowest diversity and the highest divergence, and other populations (European, Indian, and Mexican) have intermediate levels of diversity and divergence. These relationships accord with expectations based on other studies and accepted models of human history. In contrast, the population-specific F(ST) estimates obtained directly from single-nucleotide polymorphisms (SNPs) do not reflect these expected relationships. We show that ascertainment bias of SNPs has less impact on the proposed haplotype-cluster-based F(ST) than on the SNP-based version, which provides a potential explanation for these results. Thus, these new measures of F(ST) and haplotype-cluster diversity provide an important new tool for population genetic analysis of high-density SNP data.
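
The general idea of treating haplotype-cluster labels as multiallelic markers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimator used in the paper: it uses a simple heterozygosity-based F(ST) (1 - H_S/H_T, combined across loci as a ratio of sums) with no sample-size correction, and the function names, the input layout, and the toy labels are all hypothetical. It does not read BEAGLE output; cluster labels are assumed to be already assigned to each haplotype at each locus.

```python
from collections import Counter


def cluster_freqs(labels):
    """Relative frequency of each cluster label among the sampled haplotypes."""
    n = len(labels)
    return {k: c / n for k, c in Counter(labels).items()}


def multilocus_summary(pops):
    """Heterozygosity-based multilocus summary (illustrative, uncorrected).

    pops maps a population name to a list of loci; each locus is a list of
    cluster labels, one per sampled haplotype (hypothetical input layout).
    Returns (global F_ST, per-population F_ST-like values, per-population
    mean cluster diversity).
    """
    names = sorted(pops)
    n_loci = len(pops[names[0]])
    h_pop = dict.fromkeys(names, 0.0)  # within-population heterozygosity, summed over loci
    h_total = 0.0                      # total heterozygosity, summed over loci

    for j in range(n_loci):
        mean_freq = Counter()
        for name in names:
            freqs = cluster_freqs(pops[name][j])
            h_pop[name] += 1.0 - sum(p * p for p in freqs.values())
            # Unweighted average of population frequencies for the total.
            for k, p in freqs.items():
                mean_freq[k] += p / len(names)
        h_total += 1.0 - sum(p * p for p in mean_freq.values())

    if h_total == 0.0:
        raise ValueError("no variation at any locus; F_ST is undefined")

    h_s = sum(h_pop.values()) / len(names)
    fst = 1.0 - h_s / h_total
    pop_fst = {name: 1.0 - h_pop[name] / h_total for name in names}
    diversity = {name: h_pop[name] / n_loci for name in names}
    return fst, pop_fst, diversity


# Toy data: two populations, two loci, four haplotypes each (made up).
toy = {
    "A": [["a", "a", "b", "b"], ["x", "y", "x", "y"]],
    "B": [["a", "a", "a", "a"], ["x", "x", "x", "x"]],
}
fst, pop_fst, diversity = multilocus_summary(toy)
print(fst, pop_fst, diversity)
```

In this toy input, population A is the more diverse and shows the smaller population-specific value, while the monomorphic population B shows the larger one, which mirrors the qualitative pattern described in the abstract. Summing heterozygosities over loci before taking the ratio is one common way to combine loci; a sample-size-corrected, population-specific estimator would be needed for real data.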

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Africa
  • Algorithms
  • Asia
  • Chromosomes, Human, Pair 22 / genetics
  • Cluster Analysis
  • Europe
  • Genetic Variation*
  • Genetics, Population / methods*
  • Haplotypes / genetics*
  • Humans
  • Markov Chains
  • Mexico
  • Polymorphism, Single Nucleotide*
  • Software