A Pangenome Approach to Detect and Genotype TE Insertion Polymorphisms

Methods Mol Biol. 2023:2607:85-94. doi: 10.1007/978-1-0716-2883-6_5.

Abstract

Pangenome graphs are flexible data structures that contain the genetic variation that exists in a population of genomes and describe the sequences of the many possible ensuing haplotypes. Here, we use such a pangenome graph to represent and genotype transposable element (TE) polymorphisms. By combining the TE annotation (Alus, L1s, and SVAs) of the human genome reference with novel TE insertions observed in two high-quality assemblies (HG002 and HG00733), we show how to create a TE pangenome that consists of ~1.2 million reference and 2939 non-reference TEs. We then demonstrate this approach by aligning short-read sequencing data to the graph and genotyping TE deletions and insertions with reasonable specificity and sensitivity (F1-score of 0.85).
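For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall computed over genotype calls. The following is a minimal sketch, not the authors' evaluation code. The function name `f1_score` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not results from the chapter.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    true_pos:  TE calls that match the truth set
    false_pos: TE calls absent from the truth set
    false_neg: truth-set TEs that were not called
    """
    called = true_pos + false_pos
    expected = true_pos + false_neg
    precision = true_pos / called if called else 0.0
    recall = true_pos / expected if expected else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not taken from the chapter).
example = f1_score(true_pos=85, false_pos=15, false_neg=15)
print(round(example, 2))
```

With equal precision and recall, the F1-score equals both, so a single value can summarize performance only when the two are reasonably balanced.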

Keywords: Genotyping; Pangenome graphs; Structural variation; Transposable elements.

MeSH terms

  • DNA Transposable Elements* / genetics
  • Genome, Human
  • Genotype
  • Haplotypes
  • Humans
  • Polymorphism, Genetic*

Substances

  • DNA Transposable Elements