Probabilistic single-individual haplotyping

Bioinformatics. 2014 Sep 1;30(17):i379-85. doi: 10.1093/bioinformatics/btu484.

Abstract

Motivation: Accurate haplotyping (determining from which parent particular portions of the genome are inherited) is still mostly an unresolved problem in genomics. This problem has only recently started to become tractable, thanks to the development of new long-read sequencing technologies. Here, we introduce ProbHap, a haplotyping algorithm targeted at such technologies. The main algorithmic idea of ProbHap is a new dynamic programming algorithm that exactly optimizes a likelihood function that is specified by a probabilistic graphical model and that generalizes a popular objective called the minimum error correction. In addition to being accurate, ProbHap also provides confidence scores at phased positions.
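
For readers unfamiliar with the minimum error correction (MEC) objective mentioned above, the following is a minimal sketch of how an MEC score is usually computed for a candidate haplotype. It is not taken from the ProbHap source code; the function names, the list-based read representation, and the toy data are all assumptions made for illustration. Each read is a list of alleles (0 or 1) over the heterozygous positions, with None marking positions the read does not cover.

```python
from typing import List, Optional, Sequence

Allele = Optional[int]  # 0 or 1, or None where the read does not cover the site


def mismatches(read: Sequence[Allele], hap: Sequence[int]) -> int:
    """Count covered positions where the read disagrees with the haplotype."""
    return sum(1 for r, h in zip(read, hap) if r is not None and r != h)


def mec_score(reads: Sequence[Sequence[Allele]], hap: Sequence[int]) -> int:
    """MEC score: each read is assigned to whichever of the two
    complementary haplotypes it matches best, and the remaining
    mismatches (the minimum corrections needed) are summed."""
    complement: List[int] = [1 - h for h in hap]
    return sum(min(mismatches(r, hap), mismatches(r, complement)) for r in reads)


# Toy data (illustrative only): four reads over four heterozygous sites.
reads = [
    [0, 0, 1, None],
    [None, 0, 1, 1],
    [1, 1, 0, None],
    [1, 1, 1, 0],
]
candidate = [0, 0, 1, 1]
print(mec_score(reads, candidate))
```

The score is symmetric in the two haplotypes, so a candidate and its complement always receive the same value. This sketch only evaluates a given candidate; it does not perform the dynamic programming optimization described in the abstract.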

Results: On a standard benchmark dataset, ProbHap makes 11% fewer errors than current state-of-the-art methods. This accuracy can be further increased by excluding low-confidence positions, at the cost of a small drop in haplotype completeness.
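
The trade-off between accuracy and completeness described above can be illustrated with a short sketch. The tuple layout (position, allele, confidence), the threshold value, and the function name are hypothetical and do not reflect ProbHap's actual output format; the numbers are made-up toy values, not results from the paper.

```python
from typing import List, Tuple

PhasedCall = Tuple[int, int, float]  # (position, allele, confidence); hypothetical layout


def filter_by_confidence(
    calls: List[PhasedCall], threshold: float
) -> Tuple[List[Tuple[int, int]], float]:
    """Keep phased positions whose confidence meets the threshold and
    report the fraction of positions retained (a simple completeness proxy)."""
    kept = [(pos, allele) for pos, allele, conf in calls if conf >= threshold]
    completeness = len(kept) / len(calls) if calls else 0.0
    return kept, completeness


# Toy data (illustrative only).
calls = [(100, 0, 0.99), (250, 1, 0.60), (400, 1, 0.95), (900, 0, 0.999)]
kept, completeness = filter_by_confidence(calls, threshold=0.9)
print(kept, completeness)
```

Raising the threshold removes more uncertain positions, which is the mechanism by which accuracy can rise while completeness falls.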

Availability: Our source code is freely available at: https://github.com/kuleshov/ProbHap.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Genome, Human
  • Genomics
  • Haplotypes*
  • Humans
  • Likelihood Functions
  • Models, Statistical*
  • Sequence Analysis, DNA / methods*