RepExpress: A Novel Pipeline for the Quantification and Characterization of Transposable Element Expression from RNA-seq Data

Curr Protoc. 2021 Aug;1(8):e206. doi: 10.1002/cpz1.206.

Abstract

Transposable elements (TEs) are key regulators of both development and disease; however, their repetitive nature presents substantial computational challenges to their analysis. Due to a lack of computational tools and suitable analysis frameworks, TE expression is often not quantified at the locus level. Therefore, we have developed RepExpress, a novel pipeline that enables locus-level TE quantification and characterization. RepExpress characterizes TE expression in a genomic context and is the first tool focusing on the identification of tissue-specific TE-derived and TE-regulated genes. It identifies expressed TEs that overlap annotated genomic features and enables tissue-specific profiling of TE-derived genes. Expressed TEs that do not overlap any known genomic feature are characterized by the closest downstream genomic feature, enabling identification of novel TE-gene regulatory relationships. RepExpress takes standard RNA-seq data as input and performs genomic alignment optimized for TEs. The pipeline quantifies expression of TEs and genes using featureCounts and StringTie, respectively. RepExpress then filters expressed repeats and characterizes their genomic context, enabling the identification of TEs that overlap with genes or that may be influencing gene expression. Here, we describe RepExpress and provide a step-by-step protocol detailing its workflow. We also discuss other TE analysis tools and their applicability to addressing different biological questions. © 2021 Wiley Periodicals LLC. Basic Protocol: RepExpress workflow.
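
The characterization step described above (keep expressed TEs, label those that overlap an annotated feature, and otherwise report the closest downstream feature) can be illustrated with a minimal Python sketch. This is not the RepExpress implementation. The `Interval` class, the `characterize` function, the `min_count` threshold of 10 reads, the 0-based half-open coordinates, and the choice to define "downstream" relative to the TE's own strand are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Interval:
    """A genomic interval; coordinates are 0-based, half-open (assumption)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def closest_downstream(te: Interval, genes: List[Interval]) -> Optional[Interval]:
    """Nearest gene lying entirely downstream of the TE.

    Assumption: "downstream" is taken relative to the TE strand, so higher
    coordinates for "+" and lower coordinates for "-".
    """
    best, best_dist = None, None
    for g in genes:
        if g.chrom != te.chrom:
            continue
        dist = g.start - te.end if te.strand == "+" else te.start - g.end
        if dist < 0:  # gene is upstream of, or overlapping, the TE
            continue
        if best_dist is None or dist < best_dist:
            best, best_dist = g, dist
    return best


def characterize(
    tes: List[Interval],
    counts: Dict[str, int],
    genes: List[Interval],
    min_count: int = 10,  # hypothetical expression threshold
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each expressed TE to (category, associated gene name or None)."""
    result = {}
    for te in tes:
        if counts.get(te.name, 0) < min_count:
            continue  # drop TEs that are not considered expressed
        hit = next((g for g in genes if overlaps(te, g)), None)
        if hit is not None:
            result[te.name] = ("overlapping", hit.name)
        else:
            nearest = closest_downstream(te, genes)
            result[te.name] = ("intergenic", nearest.name if nearest else None)
    return result


# Toy data, invented for this example only.
genes = [
    Interval("geneA", "chr1", 100, 500, "+"),
    Interval("geneB", "chr1", 2000, 3000, "+"),
]
tes = [
    Interval("TE1", "chr1", 450, 600, "+"),    # overlaps geneA
    Interval("TE2", "chr1", 1000, 1200, "+"),  # intergenic, geneB is downstream
    Interval("TE3", "chr1", 5000, 5100, "+"),  # too few reads, filtered out
]
counts = {"TE1": 50, "TE2": 30, "TE3": 2}
print(characterize(tes, counts, genes))
```

The linear scan over genes keeps the sketch short; on a real annotation, a sorted index or an interval tree would be the more practical choice.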

Keywords: RNA sequencing; RNA-seq; transposable elements; TE expression; TE-derived genes; TE-regulated genes; TEs.

MeSH terms

  • DNA Transposable Elements* / genetics
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Genomics*
  • RNA-Seq

Substances

  • DNA Transposable Elements