Whole-genome expression analysis of mammalian-wide interspersed repeat elements in human cell lines

DNA Res. 2017 Feb 1;24(1):59-69. doi: 10.1093/dnares/dsw048.

Abstract

With more than 500,000 copies, mammalian-wide interspersed repeats (MIRs), a sub-group of SINEs, represent ∼2.5% of the human genome and one of the most numerous families of potential targets for the RNA polymerase (Pol) III transcription machinery. Since MIR elements ceased to amplify ∼130 million years ago, previous studies primarily focused on their genomic impact, while their expression has not been extensively addressed. We applied a dedicated bioinformatic pipeline to ENCODE RNA-Seq datasets of seven human cell lines and, for the first time, were able to define the Pol III-driven MIR transcriptome at single-locus resolution. While the majority of Pol III-transcribed MIR elements are cell-specific, we discovered a small set of ubiquitously transcribed MIRs that map within Pol II-transcribed genes in antisense orientation and could influence the expression of the overlapping gene. We also identified novel Pol III-transcribed ncRNAs, deriving from transcription of annotated MIR fragments flanked by unique MIR-unrelated sequences, and confirmed the role of Pol III-specific internal promoter elements in MIR transcription. Besides demonstrating widespread transcription at these retrotranspositionally inactive elements in human cells, the ability to profile MIR expression at single-locus resolution will facilitate their study in different cell types and states, including pathological alterations.
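As an illustration only, the two classifications mentioned in the abstract (cell-specific versus ubiquitously transcribed loci, and MIRs lying antisense within a host gene) can be sketched as below. This is not the authors' pipeline, which the abstract does not describe in detail. The names Interval, classify_loci and is_antisense_within, the coordinate convention, and the definition of "cell-specific" as "expressed in exactly one cell line" are all assumptions made for the example.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Interval(NamedTuple):
    """Hypothetical genomic interval (0-based start, exclusive end)."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def classify_loci(
    expressed_by_line: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (ubiquitous, cell_specific) locus IDs.

    Assumption: a locus is 'ubiquitous' if it is expressed in every cell
    line supplied, and 'cell-specific' if it is expressed in exactly one.
    """
    line_sets = list(expressed_by_line.values())
    all_loci = set().union(*line_sets)
    # Count in how many cell lines each locus was called as expressed.
    counts = {
        locus: sum(locus in s for s in line_sets) for locus in all_loci
    }
    n_lines = len(line_sets)
    ubiquitous = {l for l, c in counts.items() if c == n_lines}
    cell_specific = {l for l, c in counts.items() if c == 1}
    return ubiquitous, cell_specific


def is_antisense_within(mir: Interval, gene: Interval) -> bool:
    """True if the MIR lies fully inside the gene on the opposite strand."""
    return (
        mir.chrom == gene.chrom
        and mir.strand != gene.strand
        and gene.start <= mir.start
        and mir.end <= gene.end
    )
```

In practice the input sets would come from per-cell-line expression calls; the sketch deliberately leaves out how such calls are made, since that step is not specified in the abstract.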

Keywords: ENCODE; RNA polymerase III; RNA-Seq; SINE; mammalian-wide interspersed repeats.

MeSH terms

  • Computational Biology
  • Gene Expression Profiling
  • HeLa Cells
  • Humans
  • Interspersed Repetitive Sequences*
  • Plasmids
  • Retroelements
  • Sequence Analysis, RNA
  • Transcription, Genetic

Substances

  • Retroelements