Alu exonization events reveal features required for precise recognition of exons by the splicing machinery

PLoS Comput Biol. 2009 Mar;5(3):e1000300. doi: 10.1371/journal.pcbi.1000300. Epub 2009 Mar 6.

Abstract

Despite decades of research, the question of how the mRNA splicing machinery precisely identifies short exonic islands within the vast intronic oceans remains largely obscure. In this study, we analyzed Alu exonization events, aiming to understand the requirements for correct selection of exons. Comparison of exonizing Alus to their non-exonizing counterparts is informative because Alus in these two groups have retained high sequence similarity but are perceived differently by the splicing machinery. We identified and characterized numerous features used by the splicing machinery to discriminate between Alu exons and their non-exonizing counterparts. Of these, the most novel is secondary structure: Alu exons in general, and their 5' splice sites (5'ss) in particular, are characterized by decreased stability of local secondary structures relative to their non-exonizing counterparts. We detected numerous further differences between Alu exons and their non-exonizing counterparts, including in exon-intron architecture and in the strength of splicing signals, enhancers, and silencers. Support vector machine analysis revealed that these features allow a high level of discrimination (AUC = 0.91) between exonizing and non-exonizing Alus. Moreover, the computationally derived probabilities of exonization significantly correlated with the biological inclusion level of the Alu exons, and the model could also be extended to general datasets of constitutive and alternative exons. This indicates that the features detected and explored in this study provide the basis not only for precise exon selection but also for the fine-tuned regulation thereof, manifested in cases of alternative splicing.
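
To make the reported discrimination measure concrete, the following is a minimal sketch of how an AUC value can be computed from classifier scores. It is not the authors' pipeline: the function name `auc` and the score lists are hypothetical, and the numbers are invented purely for illustration. The sketch relies on the standard interpretation of AUC as the probability that a randomly chosen positive example (here, an exonizing Alu) receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties contribute 0.5, which matches the rank-based definition of AUC.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier outputs, NOT data from the study.
exonizing = [0.9, 0.8, 0.6, 0.4]
non_exonizing = [0.7, 0.3, 0.2, 0.1]

toy_auc = auc(exonizing, non_exonizing)
print(f"toy AUC = {toy_auc:.3f}")
```

On this scale, 0.5 corresponds to random guessing and 1.0 to perfect separation of the two groups, which is the context in which the reported AUC of 0.91 should be read.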

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alu Elements*
  • Enhancer Elements, Genetic
  • Exons*
  • Introns
  • Nucleic Acid Conformation
  • RNA Splicing*
  • Spliceosomes