Haplotype inference by Pure Parsimony: a survey

J Comput Biol. 2010 Aug;17(8):969-92. doi: 10.1089/cmb.2009.0101.

Abstract

Given a set of genotypes from a population, the process of recovering the haplotypes that explain the genotypes is called haplotype inference. Under the assumption of pure parsimony, the haplotype inference problem consists of finding the smallest set of haplotypes that explains a given set of genotypes. This problem is NP-hard. The original formulations for solving the Haplotype Inference by Pure Parsimony (HIPP) problem were based on integer linear programming and branch-and-bound techniques. More recently, solutions based on Boolean satisfiability, pseudo-Boolean optimization, and answer set programming have been shown to be remarkably more efficient. HIPP can now be regarded as a feasible approach to haplotype inference, one that can be competitive with other approaches. This article provides an overview of the methods for solving the HIPP problem, including preprocessing, bounding techniques, and heuristic approaches. The article also presents an empirical evaluation of exact HIPP solvers on a comprehensive set of synthetic and real problem instances, together with an evaluation of the bounding techniques for the exact problem. The final section compares the HIPP approach with a well-established statistical method that serves as the reference algorithm for this problem, and discusses the results.
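To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the solvers surveyed in the article. It assumes the encoding commonly used in the HIPP literature, which the abstract itself does not spell out: a genotype is a string over 0, 1, and 2, where 0 and 1 mark homozygous sites and 2 marks a heterozygous site, and a haplotype is a string over 0 and 1. A pair of haplotypes explains a genotype if both agree with it at every homozygous site and differ from each other at every heterozygous site. The function names (`explains`, `candidate_haplotypes`, `hipp_brute_force`) are hypothetical, chosen for illustration only.

```python
from itertools import combinations, product


def explains(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) explains genotype g."""
    for a, b, s in zip(h1, h2, g):
        if s == "2":
            # Heterozygous site: the two haplotypes must differ here.
            if a == b:
                return False
        elif a != s or b != s:
            # Homozygous site: both haplotypes must carry the genotype value.
            return False
    return True


def candidate_haplotypes(g):
    """All haplotypes that can take part in some explanation of g."""
    options = [("0", "1") if s == "2" else (s,) for s in g]
    return {"".join(p) for p in product(*options)}


def hipp_brute_force(genotypes):
    """Return a smallest haplotype set explaining every genotype.

    Exhaustive search over subsets of candidates, in order of increasing
    size. The running time is exponential, so this is usable only on
    tiny inputs.
    """
    pool = sorted(set().union(*(candidate_haplotypes(g) for g in genotypes)))
    for k in range(1, len(pool) + 1):
        for subset in combinations(pool, k):
            if all(
                any(explains(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return set(subset)
    return set()


# Three genotypes over two sites: three haplotypes suffice.
print(hipp_brute_force(["02", "22", "20"]))
```

For the genotypes `02`, `22`, and `20`, the search returns the three haplotypes `00`, `01`, and `10`: the pair (`00`, `01`) explains `02`, (`00`, `10`) explains `20`, and (`01`, `10`) explains `22`. Explaining `22` with (`00`, `11`) instead would need four haplotypes, which is the kind of choice the parsimony criterion rules out. The exponential cost of this search is consistent with the NP-hardness noted above and is why exact solvers rely on formulations such as integer linear programming or Boolean satisfiability.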

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Animals
  • Genotype
  • Haplotypes*
  • Humans
  • Models, Genetic*