Sampling distribution of summary linkage disequilibrium measures

Ann Hum Genet. 2002 May;66(Pt 3):223-33. doi: 10.1017/S0003480002001082.

Abstract

The design of association studies depends critically on the extent of linkage disequilibrium (LD) across different genomic regions, which is often summarised as the mean absolute value of a summary LD measure. The two most commonly used measures are D', which estimates the magnitude or extent of LD, and Delta, which is directly proportional to the power of LD mapping. We studied the sampling distribution of the mean of the |D'| and |Delta| statistics for varying sample sizes and major allele frequencies. When the sample size is small or one allele frequency is extreme, estimates of the magnitude of association based on the mean of |D'| can be substantially inflated. This inflation is more marked when the haplotype frequencies have been inferred from genotype counts. The net effect is that smaller studies will tend to show higher levels of LD. The inflation can be reduced by using a bootstrap correction and by avoiding markers with extreme allele frequencies. In contrast, the |Delta| statistic is much less affected by sample size and high major allele frequencies. These effects are illustrated with real data on 36 SNPs typed in an Ashkenazi Jewish population.
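The small-sample inflation described above can be illustrated with a short simulation. The sketch below is not the authors' procedure. It assumes the usual textbook definitions: D = p11 - pA*pB, D' = D / Dmax, and Delta taken as the correlation coefficient between alleles at the two loci. It also assumes haplotypes are observed directly rather than inferred from genotype counts. The function names `ld_measures` and `mean_abs_ld` are hypothetical, chosen only for this illustration.

```python
import math
import random


def ld_measures(n11, n12, n21, n22):
    """Return (|D'|, |Delta|) from four haplotype counts.

    n11 counts haplotypes carrying allele 1 at both loci, n12 allele 1 at
    the first locus and allele 2 at the second, and so on. Returns None
    when the sample is empty or either locus is monomorphic, because both
    measures are then undefined.
    """
    n = n11 + n12 + n21 + n22
    if n == 0:
        return None
    p_a = (n11 + n12) / n  # frequency of allele 1 at the first locus
    p_b = (n11 + n21) / n  # frequency of allele 1 at the second locus
    if not (0 < p_a < 1 and 0 < p_b < 1):
        return None
    d = n11 / n - p_a * p_b
    # D' scales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # Delta is assumed here to be the allelic correlation coefficient.
    delta = abs(d) / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, delta


def mean_abs_ld(p_a, p_b, d, n_haplotypes, replicates=500, seed=0):
    """Average |D'| and |Delta| over simulated samples of haplotypes.

    Samples are drawn from the population haplotype frequencies implied by
    p_a, p_b and d. Replicates in which a locus is monomorphic are skipped.
    """
    p11 = p_a * p_b + d
    freqs = [p11, p_a - p11, p_b - p11, 1 - p_a - p_b + p11]
    if min(freqs) < 0:
        raise ValueError("d is outside the range allowed by p_a and p_b")
    rng = random.Random(seed)
    sum_d_prime = 0.0
    sum_delta = 0.0
    used = 0
    for _ in range(replicates):
        draws = rng.choices(range(4), weights=freqs, k=n_haplotypes)
        counts = [draws.count(k) for k in range(4)]
        result = ld_measures(*counts)
        if result is None:
            continue
        sum_d_prime += result[0]
        sum_delta += result[1]
        used += 1
    if used == 0:
        raise ValueError("every replicate was monomorphic")
    return sum_d_prime / used, sum_delta / used


if __name__ == "__main__":
    # No true LD (d = 0) and a major allele frequency of 0.8 at both loci.
    for n in (30, 300, 3000):
        mean_d_prime, mean_delta = mean_abs_ld(0.8, 0.8, 0.0, n)
        print(n, round(mean_d_prime, 3), round(mean_delta, 3))
```

Running the script prints the mean |D'| and mean |Delta| for three sample sizes in a population with no true LD, so any non-zero mean reflects sampling behaviour only. Because Dmax never exceeds the square root of pA(1-pA)pB(1-pB), |D'| is always at least as large as |Delta| in the same sample.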

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Linkage Disequilibrium*
  • Research Design*
  • Sampling Studies
  • Selection Bias