Purine-rich low complexity regions are potential RNA binding hubs in the human genome

F1000Res. 2018 Jan 17:7:76. doi: 10.12688/f1000research.13522.2. eCollection 2018.

Abstract

Many long noncoding RNAs are bound to chromatin, and some of these interactions are mediated by triple helices. It is usually assumed that a transcript can form triplexes with a distinct set of genomic loci, also known as triplex target sites (TTSs). Here we performed computational analyses of the TTSs that have been experimentally identified for particular RNAs. To assess the ability of these TTSs to bind other transcripts, we developed a method to estimate the statistical significance of the predicted number of triplexes for a given RNA-DNA pair. We demonstrated that each DNA set included a subset of sequences that have the potential to form a statistically significant (adjusted p-value < 0.01) number of triplexes with the majority (>90%) of the analyzed transcripts. Because these DNA sequences are predicted to interact with a wide range of different RNAs, we called them "universal TTSs". While the universal TTSs were quite rare in the human genome (around 0.5%), they were more frequent (>15%) among the MEG3 binding sites (ChOP-seq peaks) and especially among the shared Capture-seq peaks (40%). The universal TTSs were enriched in purine-rich low complexity regions. The role of chromatin-bound RNAs in the formation of 3D chromatin structure is currently under active discussion. We speculate that such universal TTSs may contribute to establishing long-distance chromosomal contacts and may facilitate distal enhancer-promoter interactions. All the scripts and the data files related to this study are available at: https://github.com/vanya-antonov/universal_tts.
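
The "universal TTS" criterion stated above (adjusted p-value < 0.01 for more than 90% of the analyzed transcripts) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available in the repository linked above. The function names, the input layout (a dictionary of per-RNA adjusted p-values for each DNA sequence) and the use of the Benjamini-Hochberg procedure for the adjustment are all assumptions made for illustration; the abstract does not say which correction was applied.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjustment (ASSUMED here; the abstract does not
    name the correction method that was actually used)."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def find_universal_tts(
    adj_p: Dict[str, Dict[str, float]],
    alpha: float = 0.01,
    min_fraction: float = 0.9,
) -> List[str]:
    """Return the DNA sequences whose adjusted p-value is below `alpha`
    for more than `min_fraction` of the analyzed RNAs.

    `adj_p` maps a hypothetical DNA identifier to a dictionary of
    {RNA identifier: adjusted p-value}.
    """
    universal = []
    for dna_id, per_rna in adj_p.items():
        if not per_rna:
            continue  # no RNAs were tested against this sequence
        n_significant = sum(1 for p in per_rna.values() if p < alpha)
        if n_significant / len(per_rna) > min_fraction:
            universal.append(dna_id)
    return universal
```

With the thresholds from the abstract as defaults, a sequence that is significant for 19 of 20 transcripts (95%) would be reported as universal, while one that is significant for 10 of 20 would not.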

Keywords: MEG3 lncRNA; triple helix; triplex target sites (TTS).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Binding Sites*
  • Genome, Human*
  • Humans
  • Models, Theoretical
  • Purine Nucleotides*
  • RNA / genetics*
  • RNA, Long Noncoding / genetics
  • Regulatory Sequences, Nucleic Acid

Substances

  • Purine Nucleotides
  • RNA, Long Noncoding
  • RNA

Grants and funding

This work was supported by the Russian Science Foundation [grant 14-15-30002].