Roar: detecting alternative polyadenylation with standard mRNA sequencing libraries

BMC Bioinformatics. 2016 Oct 18;17(1):423. doi: 10.1186/s12859-016-1254-8.

Abstract

Background: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Changes in the length of 3' UTRs have been found to play an important role in this context, since alternative 3' UTRs can lead to differences, for example, in regulation by microRNAs and in the cellular localization of the transcripts, thus altering their fate.

Results: We propose a strategy to identify genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries; the strategy is therefore widely applicable to data originally generated for classical differential expression analyses. We exploit previously annotated alternative polyadenylation (APA) sites from public databases, in contrast with other recently proposed approaches in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing its results with those of methods based on microarrays or on specialized RNA-seq libraries, and we show that using APA site databases yields higher sensitivity than a de novo site prediction approach.

Conclusions: We implemented the algorithm in a Bioconductor package to facilitate its broad usage in the scientific community. Because this approach can detect 3' UTR shortening from libraries with a number of reads comparable to that needed for differential expression analyses, it is useful for investigating whether alternative polyadenylation is relevant in a given biological process without requiring specific experimental assays.

Keywords: 3' UTR; Bioconductor; Polyadenylation; RNA-sequencing; Software.

MeSH terms

  • 3' Untranslated Regions / genetics*
  • Algorithms*
  • Brain / metabolism*
  • Breast Neoplasms / genetics*
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Polyadenylation / genetics*
  • RNA, Messenger / genetics*
  • Reproducibility of Results

Substances

  • 3' Untranslated Regions
  • RNA, Messenger