An interpretable model of pre-mRNA splicing for animal and plant genes

Sci Adv. 2024 May 10;10(19):eadn1547. doi: 10.1126/sciadv.adn1547. Epub 2024 May 8.

Abstract

Pre-mRNA splicing is a fundamental step in gene expression, conserved across eukaryotes, in which the spliceosome recognizes motifs at the 3' and 5' splice sites (SSs), excises introns, and ligates exons. SS recognition and pairing are often influenced by protein splicing factors (SFs) that bind to splicing regulatory elements (SREs). Here, we describe SMsplice, a fully interpretable model of pre-mRNA splicing that combines models of core SS motifs, SREs, and exonic and intronic length preferences. We learn models that predict SS locations with 83 to 86% accuracy in fish, insects, and plants and about 70% accuracy in mammals. Learned SRE motifs include both known SF binding motifs and unfamiliar motifs, and both motif classes are supported by genetic analyses. Our comparisons across species highlight similarities among non-mammals, increased reliance on intronic SREs in plant splicing, and a greater reliance on SREs in mammalian splicing.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Exons* / genetics
  • Genes, Plant
  • Humans
  • Introns* / genetics
  • Models, Genetic
  • Plants / genetics
  • RNA Precursors* / genetics
  • RNA Precursors* / metabolism
  • RNA Splice Sites*
  • RNA Splicing Factors / genetics
  • RNA Splicing Factors / metabolism
  • RNA Splicing*
  • Spliceosomes / genetics
  • Spliceosomes / metabolism

Substances

  • RNA Precursors
  • RNA Splice Sites
  • RNA Splicing Factors