Identification and characterization of NAGNAG alternative splicing in the moss Physcomitrella patens

BMC Plant Biol. 2010 Apr 28:10:76. doi: 10.1186/1471-2229-10-76.

Abstract

Background: Alternative splicing (AS) involving tandem acceptors that are separated by three nucleotides (NAGNAG) is an evolutionarily widespread class of AS, which is well studied in Homo sapiens (human) and Mus musculus (mouse). It has also been shown to be common in the model seed plants Arabidopsis thaliana and Oryza sativa (rice). In one of the first studies involving sequence-based prediction of AS in plants, we performed a genome-wide identification and characterization of NAGNAG AS in the model plant Physcomitrella patens, a moss.

Results: Using Sanger data, we found 295 alternatively used NAGNAG acceptors in P. patens. Using 31 features and training and test datasets of constitutive and alternative NAGNAGs, we trained a classifier to predict the splicing outcome at NAGNAG tandem splice sites (alternative splicing, constitutive at the first acceptor, or constitutive at the second acceptor). Our classifier achieved a balanced specificity and sensitivity of >or= 89%. Subsequently, a classifier trained exclusively on data well supported by transcript evidence was used to make genome-wide predictions of NAGNAG splicing outcomes. By generation of more transcript evidence from a next-generation sequencing platform (Roche 454), we found additional evidence for NAGNAG AS, with altogether 664 alternative NAGNAGs being detected in P. patens using all currently available transcript evidence. The 454 data also enabled us to validate the predictions of the classifier, with 64% (80/125) of the well-supported cases of AS being predicted correctly.

Conclusion: NAGNAG AS is just as common in the moss P. patens as it is in the seed plants A. thaliana and O. sativa (but not conserved on the level of orthologous introns), and can be predicted with high accuracy. The most informative features are the nucleotides in the NAGNAG and in its immediate vicinity, along with the splice sites scores, as found earlier for NAGNAG AS in animals. Our results suggest that the mechanism behind NAGNAG AS in plants is similar to that in animals and is largely dependent on the splice site and its immediate neighborhood.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing / genetics*
  • Base Sequence
  • Bryopsida / genetics*
  • Computational Biology
  • Conserved Sequence
  • Evolution, Molecular
  • Expressed Sequence Tags
  • Humans
  • Molecular Sequence Data
  • RNA Splice Sites / genetics
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • ROC Curve
  • Reproducibility of Results
  • Terminology as Topic

Substances

  • RNA Splice Sites
  • RNA, Messenger