The reliability of haplotyping inference in nuclear families: misassignment rates for SNPs and microsatellites

Hum Hered. 2004;57(3):117-27. doi: 10.1159/000079242.

Abstract

Single nucleotide polymorphisms (SNPs) are widely used when investigators try to map complex disease genes. Although biallelic SNP markers are less informative than microsatellite markers, one can increase their information content by using haplotypes. However, assigning haplotypes (i.e., assigning phase) correctly can be problematic in the presence of SNP heterozygosity. For example, a doubly heterozygous individual, with genotype 12, 12, could have haplotypes 1-1/2-2 or 1-2/2-1 with equal probability; in the absence of additional information, there is no way to determine which haplotype is correct. Thus an algorithm that assigns haplotypes to such an individual will assign the wrong one 50% of the time. We have studied the frequency of haplotype misassignments, i.e., haplotypes that are misassigned solely because of inherent marker ambiguity (not because of errors in genotyping or calculation). We examined both SNPs and microsatellite markers. We used the computer programs GENEHUNTER and SIMWALK to assign the haplotypes. We simulated (a) families with 1-5 children, (b) haplotypes involving different numbers of marker loci (3, 5, 7 and 10 loci, all in linkage equilibrium), and (c) different allele frequencies. Misassignment rates are highest (a) in small families, (b) with many SNP loci, and (c) for loci with the greatest heterozygosity (i.e., where both alleles have frequency 0.5). For example, for triads (i.e., one-child families with both parents genotyped), misassignment rates for SNPs can reach almost 50%. Family sizes of 4-5 children are required in order to ensure a misassignment frequency of ≤ 5% for ten-SNP haplotypes with allele frequencies of 0.25-0.5. For microsatellites, a family size of at least 2-3 children is necessary to keep haplotyping misassignments ≤ 5%.
Finally, we point out that it is misleading for a computer program to yield haplotype assignments without indicating that they may have been misassigned, and we discuss the implications of these misassignments for association and linkage analysis.
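
The phase ambiguity described in the abstract can be illustrated with a short sketch. It is not the method used by GENEHUNTER or SIMWALK; it only enumerates the haplotype pairs consistent with one individual's unphased genotypes, ignoring all family information. The function names phase_resolutions and heterozygosity are hypothetical, and the heterozygosity formula 2p(1-p) assumes a biallelic locus in Hardy-Weinberg proportions, which the abstract does not state.

```python
from itertools import product


def phase_resolutions(genotypes):
    """List the distinct unordered haplotype pairs consistent with
    unphased genotypes.

    genotypes: list of (allele_a, allele_b) tuples, one per locus.
    Illustrative only: no parental or sibling data are used.
    """
    # Indices of heterozygous loci; only these create phase ambiguity.
    het = [i for i, (a, b) in enumerate(genotypes) if a != b]
    # Fix the orientation of the first heterozygous locus so that each
    # unordered haplotype pair is generated once.
    free = het[1:]
    pairs = set()
    for flips in product([False, True], repeat=len(free)):
        flip_at = dict(zip(free, flips))
        hap1, hap2 = [], []
        for i, (a, b) in enumerate(genotypes):
            if flip_at.get(i, False):
                a, b = b, a
            hap1.append(str(a))
            hap2.append(str(b))
        pairs.add(tuple(sorted(["-".join(hap1), "-".join(hap2)])))
    return sorted(pairs)


def heterozygosity(p):
    """Expected heterozygote frequency at a biallelic locus with allele
    frequency p, assuming Hardy-Weinberg proportions (an assumption)."""
    return 2 * p * (1 - p)


if __name__ == "__main__":
    # The doubly heterozygous individual from the abstract (genotype 12, 12).
    print(phase_resolutions([(1, 2), (1, 2)]))
    # Number of candidate haplotype pairs when all ten SNPs are heterozygous.
    print(len(phase_resolutions([(1, 2)] * 10)))
    # Heterozygosity at the two ends of the allele-frequency range cited.
    print(heterozygosity(0.5), heterozygosity(0.25))
```

For the doubly heterozygous individual this yields the two pairs named in the abstract, 1-1/2-2 and 1-2/2-1. With k heterozygous loci the count grows as 2^(k-1), and 2p(1-p) is largest at p = 0.5, which is consistent with misassignment being most frequent for many SNP loci and for equal allele frequencies.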

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Chromosome Mapping / methods*
  • Evaluation Studies as Topic
  • Gene Frequency
  • Genetic Linkage
  • Haplotypes / genetics*
  • Heterozygote*
  • Humans
  • Microsatellite Repeats / genetics
  • Pedigree
  • Polymorphism, Single Nucleotide / genetics*
  • Research Design
  • Software*