Construction of Practical Haplotype Graph (PHG) with the Whole-Genome Sequence Data

Methods Mol Biol. 2022:2443:273-284. doi: 10.1007/978-1-0716-2067-0_15.

Abstract

With emerging sequencing technologies and falling costs, sequence data generation has expanded from a single individual to multiple (thousands of) individuals of a species. The terabytes of sequence data generated from thousands of individuals are largely redundant, and the degree of redundancy depends on the level of sequence similarity within the population. Managing such large datasets and building a unique, nonredundant sequence catalogue from such a large population makes it challenging to analyze, store, and retrieve the information. In this chapter, we discuss the practical haplotype graph (PHG), which addresses these challenges and retrieves required information such as variants and sequences more efficiently, enabling researchers to manage and assess large genomic datasets.

Keywords: Database; Graph; PHG; Pangenome; Practical haplotype graph; Second-generation sequence; Whole-genome sequence.

MeSH terms

  • Genome*
  • Genomics*
  • Haplotypes / genetics
  • Sequence Analysis, DNA