Haplotype threading: accurate polyploid phasing from long reads

Genome Biol. 2020 Sep 21;21(1):252. doi: 10.1186/s13059-020-02158-1.

Abstract

Resolving genomes at the haplotype level is crucial for understanding the evolutionary history of polyploid species and for designing advanced breeding strategies. Polyploid phasing still presents considerable challenges, especially in regions of collapsing haplotypes. We present WhatsHap polyphase, a novel two-stage approach that addresses these challenges by (i) clustering reads and (ii) threading the haplotypes through the clusters. Our method outperforms the state of the art in terms of phasing quality. Using a real tetraploid potato dataset, we demonstrate how to assemble local genomic regions of interest at the haplotype level. Our algorithm is implemented as part of the widely used open source tool WhatsHap.
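
The two stages named in the abstract can be illustrated with a small toy sketch. This is not WhatsHap's implementation: the keywords point to cluster editing for stage (i), whereas the sketch substitutes a simple greedy grouping, and the threading rule (slots proportional to cluster coverage, staying on the previous cluster where possible) is an assumption made for illustration. All function names, the read representation (a dict mapping variant position to allele) and the max_mismatch parameter are hypothetical.

```python
from collections import Counter


def consensus(cluster, pos):
    """Majority allele of a cluster at `pos`, or None if no read covers it."""
    alleles = [read[pos] for read in cluster if pos in read]
    return Counter(alleles).most_common(1)[0][0] if alleles else None


def cluster_reads(reads, max_mismatch=0.0):
    """Stage (i), toy version: greedily group reads that agree with a cluster consensus."""
    clusters = []
    for read in reads:
        for cluster in clusters:
            shared = [p for p in read if consensus(cluster, p) is not None]
            if not shared:
                continue
            mismatches = sum(read[p] != consensus(cluster, p) for p in shared)
            if mismatches / len(shared) <= max_mismatch:
                cluster.append(read)
                break
        else:
            clusters.append([read])  # no compatible cluster: start a new one
    return clusters


def thread_haplotypes(clusters, positions, ploidy):
    """Stage (ii), toy version: at each position, give each cluster a number of
    haplotype slots proportional to its coverage, so a cluster with roughly
    double coverage can carry two (collapsed) haplotypes."""
    haplotypes = [[] for _ in range(ploidy)]
    previous = [None] * ploidy
    for pos in positions:
        coverage = {}
        for i, cluster in enumerate(clusters):
            n = sum(pos in read for read in cluster)
            if n:
                coverage[i] = n
        if not coverage:  # position not covered by any read
            for hap in haplotypes:
                hap.append(None)
            continue
        total = sum(coverage.values())
        # Largest-remainder apportionment of `ploidy` slots among clusters.
        slots = {i: ploidy * n // total for i, n in coverage.items()}
        leftover = ploidy - sum(slots.values())
        by_remainder = sorted(
            coverage, key=lambda i: (ploidy * coverage[i]) % total, reverse=True
        )
        for i in by_remainder[:leftover]:
            slots[i] += 1
        # Keep each haplotype on its previous cluster when a slot is available.
        current = [None] * ploidy
        for h in range(ploidy):
            if slots.get(previous[h], 0) > 0:
                current[h] = previous[h]
                slots[previous[h]] -= 1
        # Fill the remaining haplotypes from the unused slots.
        pool = iter([i for i, n in slots.items() for _ in range(n)])
        for h in range(ploidy):
            if current[h] is None:
                current[h] = next(pool)
        for h in range(ploidy):
            haplotypes[h].append(consensus(clusters[current[h]], pos))
        previous = current
    return haplotypes


# Toy tetraploid example: two identical haplotypes (0,0,1) plus (1,0,0) and (1,1,0),
# each sampled by two error-free reads spanning all three variant positions.
reads = (
    [{0: 0, 1: 0, 2: 1}] * 4
    + [{0: 1, 1: 0, 2: 0}] * 2
    + [{0: 1, 1: 1, 2: 0}] * 2
)
clusters = cluster_reads(reads)
haplotypes = thread_haplotypes(clusters, positions=[0, 1, 2], ploidy=4)
```

In this toy input the two identical haplotypes collapse into one cluster of four reads, and the threading step assigns that cluster two of the four haplotype slots, so the result contains (0,0,1) twice alongside (1,0,0) and (1,1,0). Real data would additionally require handling of sequencing errors, uneven coverage and genotype constraints, none of which this sketch attempts.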

Keywords: Cluster editing; Haplotypes; High-throughput nucleotide sequencing; Phasing; Plant science; Polyploidy; Sequence analysis.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Haplotypes*
  • Models, Genetic*
  • Polyploidy*
  • Solanum tuberosum / genetics