Algorithms for inferring haplotypes

Tianhua Niu

doi:10.1002/gepi.20024

Genet Epidemiol. 2004 Dec;27(4):334-47. doi: 10.1002/gepi.20024.

Author

Tianhua Niu¹

Affiliation

¹ Division of Preventive Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02215, USA. tniu@rics.bwh.harvard.edu

PMID: 15368348

Abstract

Haplotype phase information in diploid organisms sheds light on human evolutionary history and may lead to more efficient strategies for identifying genetic variants that increase susceptibility to human diseases. Molecular haplotyping methods are labor-intensive, low-throughput, and very costly. By contrast, algorithms based on formal statistical theories have been shown to be very effective and cost-efficient for haplotype reconstruction. This review covers 1) population-based haplotype inference methods: Clark's algorithm, the expectation-maximization (EM) algorithm, coalescence-based algorithms (pseudo-Gibbs sampler and perfect/imperfect phylogeny), and the partition-ligation algorithm implemented by a fully Bayesian model (Haplotyper) or by EM (PLEM); 2) family-based haplotype inference methods; 3) the handling of genotype scoring uncertainties (i.e., genotyping errors and raw two-dimensional genotype scatterplots) in inferring haplotypes; and 4) haplotype inference methods for pooled DNA samples. The advantages and limitations of each algorithm are discussed. Using simulations based on empirical data on the G6PD gene and TNFRSF5 gene, I demonstrate that the algorithms differ in their sensitivity to the level of population diversity and to the genotyping error rate. Future statistical algorithms for haplotype reconstruction will draw increasingly on ideas from combinatorial mathematics, graphical models, and machine learning, and they will have a profound impact on population genetics and genetic epidemiology with the advent of the human HapMap.
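
As an illustration of the population-based EM approach named in the abstract, the following is a minimal sketch of EM estimation of haplotype frequencies from unphased genotypes. It is not taken from the review: the function names `compatible_pairs` and `em_haplotype_frequencies`, the 0/1/2 genotype coding, and the uniform starting frequencies are all assumptions made for this example, and the sketch assumes biallelic SNPs, Hardy-Weinberg equilibrium, and no missing or erroneous genotypes.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with one unphased genotype.

    genotype: tuple with one entry per SNP, coded 0/1/2 = copies of allele 1
    (hypothetical coding chosen for this sketch).
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed: 0 -> allele 0, 2 -> allele 1.
    # Heterozygous sites get a placeholder 0 that is overwritten below.
    base = [g // 2 for g in genotype]
    if not het:
        h = tuple(base)
        return [(h, h)]
    pairs = []
    # Fix the phase of the first heterozygous site so each pair is listed once.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(base), list(base)
        for site, b in zip(het, (0,) + bits):
            h1[site], h2[site] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=100, tol=1e-9):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    pairs_per_genotype = [compatible_pairs(g) for g in genotypes]
    haps = {h for pairs in pairs_per_genotype for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haps) for h in haps}  # assumed uniform start
    for _ in range(n_iter):
        # E-step: split each individual over its compatible pairs in
        # proportion to the product of the current haplotype frequencies.
        counts = defaultdict(float)
        for pairs in pairs_per_genotype:
            weights = [freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by 2N chromosomes.
        new_freq = {h: counts[h] / (2 * len(genotypes)) for h in haps}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq
```

For example, calling `em_haplotype_frequencies([(0, 0), (0, 0), (2, 2), (2, 2), (1, 1)])` resolves the single double heterozygote in favor of the two haplotypes already seen in the homozygotes. The number of compatible pairs doubles with each additional heterozygous site, which is one reason partition-ligation strategies split long marker sets into smaller blocks.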

Publication types

Research Support, U.S. Gov't, P.H.S.
Review

MeSH terms

Algorithms*
Genetic Predisposition to Disease / epidemiology*
Genetics, Population*
Genotype
Haplotypes*
Humans
Models, Genetic
Polymorphism, Single Nucleotide

Grants and funding

R01 HG002518-01/HG/NHGRI NIH HHS/United States