The folding of 5'-UTR human G-quadruplexes possessing a long central loop

RNA. 2014 Jul;20(7):1129-41. doi: 10.1261/rna.044578.114. Epub 2014 May 27.

Abstract

G-quadruplexes are widespread four-stranded structures that are adopted by G-rich regions of both DNA and RNA and are involved in essential biological processes such as mRNA translation. They are formed by the stacking of two or more G-quartets that are linked together by three loops. Although the maximal loop length is usually capped at 7 nt in most G-quadruplex-predicting software, it has already been demonstrated that artificial DNA G-quadruplexes containing two distal loops that are limited to 1 nt each and a central loop up to 30 nt long are likely to form in vitro. This report demonstrates that such structures possessing a long central loop are actually found in the 5'-UTRs of human mRNAs. Firstly, 1453 potential G-quadruplex-forming sequences (PG4s) were identified through a bioinformatic survey that searched for sequences meeting the requirement for two 1-nt long distal loops and a central loop 2-90 nt long. Secondly, in vitro in-line probing experiments confirmed and characterized the folding of eight candidates possessing central loops 10-70 nt long. Finally, the biological effect of several G-quadruplexes with a long central loop on mRNA expression was studied in cellulo using a luciferase gene reporter assay. Clearly, the current definition of G-quadruplex-forming sequences is too conservative and must be expanded to include the long central loop. This greatly expands the number of expected PG4s in the transcriptome. Consideration of these new candidates might aid in elucidating the potentially important biological implications of the G-quadruplex structure.
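
The search criteria described above (two 1-nt distal loops flanking a central loop of 2-90 nt) can be illustrated with a regular-expression scan. This is only a minimal sketch and not the authors' pipeline: the abstract does not state the required G-tract length, so the minimum of three consecutive guanines used here is an assumption, and the function names `build_pg4_pattern` and `find_long_loop_pg4s` are hypothetical.

```python
import re


def build_pg4_pattern(min_tract=3, min_central=2, max_central=90):
    """Compile a regex for G-tract, 1-nt loop, G-tract, central loop,
    G-tract, 1-nt loop, G-tract.

    min_tract=3 is an assumption; the loop bounds follow the abstract.
    """
    g = f"G{{{min_tract},}}"
    # The central loop is matched lazily so the shortest valid loop is taken.
    return re.compile(
        f"{g}[ACGU]{g}[ACGU]{{{min_central},{max_central}}}?{g}[ACGU]{g}"
    )


def find_long_loop_pg4s(seq, **kwargs):
    """Return (start, end, motif) for each non-overlapping candidate in seq."""
    pattern = build_pg4_pattern(**kwargs)
    # Accept DNA-style input by converting T to U before scanning.
    rna = seq.upper().replace("T", "U")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    # Toy sequence with a 12-nt central loop (not taken from the paper).
    toy = "AAGGGAGGGACUACUACUACUGGGUGGGCC"
    for start, end, motif in find_long_loop_pg4s(toy):
        print(start, end, motif)
```

Because `finditer` reports non-overlapping matches only, a scan like this would undercount candidates that share G-tracts; a real survey would also need to handle transcript annotation and 5'-UTR boundaries, which are outside the scope of this sketch.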

Keywords: 5′-UTR; G-quadruplex; RNA structure; in-line probing; translation regulation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 5' Untranslated Regions*
  • Base Sequence
  • Cell Cycle Proteins / genetics
  • DNA-Binding Proteins / genetics
  • G-Quadruplexes*
  • Histone Chaperones / genetics
  • Humans
  • MDS1 and EVI1 Complex Locus Protein
  • Molecular Sequence Data
  • Nucleic Acid Conformation*
  • Protein Biosynthesis
  • Proto-Oncogenes / genetics
  • RNA Folding
  • RNA Processing, Post-Transcriptional*
  • RNA, Messenger / chemistry
  • RNA, Messenger / metabolism
  • Transcription Factors / genetics

Substances

  • 5' Untranslated Regions
  • BCL2-associated athanogene 1 protein
  • Cell Cycle Proteins
  • DNA-Binding Proteins
  • HIRA protein, human
  • Histone Chaperones
  • MDS1 and EVI1 Complex Locus Protein
  • MECOM protein, human
  • RNA, Messenger
  • Transcription Factors