Functional identification of cis-regulatory long noncoding RNAs at controlled false discovery rates

Nucleic Acids Res. 2024 Apr 12;52(6):2821-2835. doi: 10.1093/nar/gkae075.

Abstract

A key attribute of some long noncoding RNAs (lncRNAs) is their ability to regulate expression of neighbouring genes in cis. However, such 'cis-lncRNAs' are presently defined using ad hoc criteria that, we show, are prone to false-positive predictions. The resulting lack of cis-lncRNA catalogues hinders our understanding of their extent, characteristics and mechanisms. Here, we introduce TransCistor, a framework for defining and identifying cis-lncRNAs based on enrichment of targets amongst proximal genes. TransCistor's simple and conservative statistical models are compatible with functionally defined target gene maps generated by existing and future technologies. Using transcriptome-wide perturbation experiments for 268 human and 134 mouse lncRNAs, we provide the first large-scale survey of cis-lncRNAs. Known cis-lncRNAs are correctly identified, including XIST, LINC00240 and UMLILO, and predictions are consistent across analysis methods, perturbation types and independent experiments. We detect cis-activity in a minority of lncRNAs, primarily involving activators over repressors. Cis-lncRNAs are detected by both RNA interference and antisense oligonucleotide perturbations. Mechanistically, cis-lncRNA transcripts are observed to physically associate with their target genes and are weakly enriched with enhancer elements. In summary, TransCistor establishes a quantitative foundation for cis-lncRNAs, opening a path to elucidating their molecular mechanisms and biological significance.
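
To make concrete the idea of "enrichment of targets amongst proximal genes", the following is a minimal sketch and not the TransCistor implementation. It assumes that enrichment can be scored with a one-sided hypergeometric test, and that "proximal" genes have already been chosen by some rule such as a fixed genomic window around the lncRNA. The function names, the gene identifiers and the proximity rule are all hypothetical; the abstract does not specify them.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes, K targets, n proximal genes drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass from the observed overlap up to the maximum possible overlap.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def cis_enrichment_pvalue(all_genes, targets, proximal):
    """One-sided p-value that perturbation targets are over-represented among proximal genes.

    all_genes: every gene tested in the perturbation experiment (the background).
    targets:   genes whose expression changed after perturbing the lncRNA.
    proximal:  genes considered close to the lncRNA (definition is an assumption).
    """
    background = set(all_genes)
    target_set = set(targets) & background
    proximal_set = set(proximal) & background
    overlap = len(target_set & proximal_set)
    return hypergeom_sf(overlap, len(background), len(target_set), len(proximal_set))


# Toy example with made-up gene names: 20 genes, 4 targets, 3 proximal genes, all 3 of them targets.
genes = [f"g{i}" for i in range(20)]
p_value = cis_enrichment_pvalue(genes, targets=genes[:4], proximal=genes[:3])
print(p_value)
```

Under this sketch, a small p-value would flag the lncRNA as a candidate cis-regulator. Applying it across many lncRNAs would additionally require multiple-testing correction to control the false discovery rate.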

MeSH terms

  • Animals
  • Computational Biology* / methods
  • Genetic Techniques*
  • Humans
  • Mice
  • RNA, Long Noncoding* / genetics
  • RNA, Long Noncoding* / isolation & purification
  • Software / standards
  • Transcription Factors / genetics
  • Transcriptome

Substances

  • RNA, Long Noncoding
  • Transcription Factors