Profiling lariat intermediates reveals genetic determinants of early and late co-transcriptional splicing

Mol Cell. 2022 Dec 15;82(24):4681-4699.e8. doi: 10.1016/j.molcel.2022.11.004. Epub 2022 Nov 25.

Abstract

Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome-wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that determine splicing timing.

Keywords: CoLa-seq; GC content; U2AF; branch point; co-transcriptional splicing; exon definition; intron definition; lariat RNAs; modeling; polypyrimidine tract.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Alternative Splicing*
  • Base Sequence
  • Exons / genetics
  • Humans
  • Introns / genetics
  • RNA Splicing*