The domain structure and distribution of Alu elements in long noncoding RNAs and mRNAs

RNA. 2016 Feb;22(2):254-64. doi: 10.1261/rna.048280.114. Epub 2015 Dec 10.

Abstract

Approximately 75% of the human genome is transcribed, and many of the resulting spliced transcripts contain primate-specific Alu elements, the most abundant mobile elements in the human genome. The majority of exonized Alu elements are located in long noncoding RNAs (lncRNAs) and the untranslated regions of mRNAs, and some perform molecular functions. To further assess the potential for Alu elements to be repurposed as functional RNA domains, we investigated the distribution and evolution of Alu elements in spliced transcripts. Our analysis revealed that Alu elements are underrepresented in mRNAs and lncRNAs, suggesting that most exonized Alu elements arising in the population are rare or deleterious to RNA function. When mRNAs and lncRNAs retain exonized Alu elements, they have a clear preference for Alu dimers, left monomers, and right monomers. mRNAs often acquire Alu elements when their genes are duplicated within Alu-rich regions. In lncRNAs, reverse-oriented Alu elements are significantly enriched and are not restricted to the 3' and 5' ends. Both lncRNAs and mRNAs primarily contain the Alu J and S subfamilies, which were amplified relatively early in primate evolution. Alu J subfamilies are typically overrepresented in lncRNAs, whereas the Alu S dimer is overrepresented in mRNAs. The sequences of Alu dimers tend to be constrained in both lncRNAs and mRNAs, whereas the left and right monomers are constrained only within particular Alu subfamilies and classes of RNA. Collectively, these findings suggest that Alu-containing RNAs are capable of forming stable structures and that some of these Alu domains might have novel biological functions.
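
The notion of under- or overrepresentation used above can be illustrated with a minimal sketch that compares the observed fraction of Alu-derived bases in a set of transcripts with a background expectation. This is not the authors' method: the function name `log2_enrichment`, the background fraction, and the base counts are all hypothetical placeholders chosen for illustration, not values from the paper.

```python
import math


def log2_enrichment(alu_bases, total_bases, background_fraction):
    """Return log2(observed / expected) Alu coverage for a transcript set.

    alu_bases           -- bases in the transcripts annotated as Alu-derived
    total_bases         -- total exonic bases in the transcript set
    background_fraction -- expected Alu fraction (assumed background value)
    Negative values indicate underrepresentation, positive values enrichment.
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if not 0 < background_fraction < 1:
        raise ValueError("background_fraction must be between 0 and 1")
    observed = alu_bases / total_bases
    if observed == 0:
        # No Alu bases at all: maximally depleted relative to background.
        return float("-inf")
    return math.log2(observed / background_fraction)


# Toy numbers only (not from the paper): 50 Alu bases out of 1000 exonic
# bases, against an assumed 10% background.
toy_score = log2_enrichment(alu_bases=50, total_bases=1000, background_fraction=0.10)
print(f"log2 enrichment: {toy_score:.2f}")
```

A score below zero, as in the toy call, would correspond to Alu elements being less frequent in the transcripts than in the background. A real analysis would also need a statistical test and a background chosen carefully for each RNA class.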

Keywords: Alu; RNA; lincRNA; lncRNA; noncoding.

MeSH terms

  • 3' Untranslated Regions*
  • 5' Untranslated Regions*
  • Alu Elements*
  • Computational Biology / methods
  • Evolution, Molecular
  • Exons
  • Genome, Human*
  • Humans
  • RNA, Long Noncoding / chemistry*
  • RNA, Long Noncoding / genetics

Substances

  • 3' Untranslated Regions
  • 5' Untranslated Regions
  • RNA, Long Noncoding