IntEREst: intron-exon retention estimator

BMC Bioinformatics. 2018 Apr 11;19(1):130. doi: 10.1186/s12859-018-2122-5.

Abstract

Background: In-depth study of the intron retention levels of transcripts provides insights into the mechanisms regulating pre-mRNA splicing efficiency. Additionally, detailed analysis of retained introns can link these introns to post-transcriptional regulation or identify aberrant splicing events in human diseases.

Results: We present IntEREst, Intron-Exon Retention Estimator, an R package that supports rigorous analysis of non-annotated intron retention events (in addition to the ones annotated by RefSeq or similar databases), and supports intra-sample in addition to inter-sample comparisons. It accepts binary sequence alignment/map (.bam) files as input and determines genome-wide estimates of intron retention or exon-exon junction levels. Moreover, it includes functions for comparing subsets of user-defined introns (e.g. U12-type vs U2-type), and its plotting functions allow visualization of the distribution of the retention levels of the introns. Statistical methods are adapted from the DESeq2, edgeR and DEXSeq R packages to extract the significantly more or less retained introns. Analyses can be performed either sequentially (on a single core) or in parallel (on multiple cores). We used IntEREst to investigate U12- and U2-type intron retention in human and plant RNA-seq datasets with defects in the U12-dependent spliceosome due to mutations in the ZRSR2 component of this spliceosome. Additionally, we compared the retained introns discovered by IntEREst with those of other methods and studies.

Conclusion: IntEREst is an R package for analysis of intron retention and exon-exon junction levels in RNA-seq data. Both the human and plant analyses show that the U12-type introns are retained at a higher level than the U2-type introns even in the control samples, and that the retention is exacerbated in patient or plant samples carrying a mutated ZRSR2 gene. Intron retention events caused by ZRSR2 mutation that we discovered using IntEREst (DESeq2 based function) show considerable overlap with the retained introns discovered by other methods (e.g. IRFinder and edgeR based function of IntEREst). Our results indicate that increasing both the number of biological replicates and the sequencing depth of the library promotes the discovery of retained introns, but the effect of library size gradually decreases beyond 35 million reads mapped to the introns.

Keywords: Alternative splicing; Bioconductor; Expression analysis; Intron retention; RNA; RNA-seq; U12-type introns; U2-type introns.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Pairing / genetics
  • Computational Biology / methods*
  • Down-Regulation / genetics
  • Exons / genetics*
  • Genome, Human
  • Humans
  • Introns / genetics*
  • Myelodysplastic Syndromes / genetics
  • Sample Size
  • Software*
  • Up-Regulation / genetics