Local assembly of long reads enables phylogenomics of transposable elements in a polyploid cell line

Nucleic Acids Res. 2022 Nov 28;50(21):e124. doi: 10.1093/nar/gkac794.

Abstract

Animal cell lines often undergo extreme genome restructuring events, including polyploidy and segmental aneuploidy that can impede de novo whole-genome assembly (WGA). In some species like Drosophila, cell lines also exhibit massive proliferation of transposable elements (TEs). To better understand the role of transposition during animal cell culture, we sequenced the genome of the tetraploid Drosophila S2R+ cell line using long-read and linked-read technologies. WGAs for S2R+ were highly fragmented and generated variable estimates of TE content across sequencing and assembly technologies. We therefore developed a novel WGA-independent bioinformatics method called TELR that identifies, locally assembles, and estimates allele frequency of TEs from long-read sequence data (https://github.com/bergmanlab/telr). Application of TELR to a ∼130x PacBio dataset for S2R+ revealed many haplotype-specific TE insertions that arose by transposition after initial cell line establishment and subsequent tetraploidization. Local assemblies from TELR also allowed phylogenetic analysis of paralogous TEs, which revealed that proliferation of TE families in vitro can be driven by single or multiple source lineages. Our work provides a model for the analysis of TEs in complex heterozygous or polyploid genomes that are recalcitrant to WGA and yields new insights into the mechanisms of genome evolution in animal cell culture.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cell Line
  • DNA Transposable Elements* / genetics
  • Drosophila / genetics
  • Phylogeny
  • Polyploidy*

Substances

  • DNA Transposable Elements