A survey on identification and quantification of alternative polyadenylation sites from RNA-seq data

Brief Bioinform. 2020 Jul 15;21(4):1261-1276. doi: 10.1093/bib/bbz068.

Abstract

Alternative polyadenylation (APA) has been implicated in post-transcriptional regulation through its effects on mRNA abundance, stability, localization and translation, and it contributes considerably to transcriptome diversity and the regulation of gene expression. RNA-seq has become a routine approach for transcriptome profiling, generating unprecedented amounts of data that can be used to identify APA sites and quantify their usage. A number of computational approaches for identifying APA sites and/or dynamic APA events from RNA-seq data have emerged in the literature. They provide valuable yet preliminary results that should be refined to yield credible guidelines for the scientific community. In this review, we provided a comprehensive overview of the status of currently available computational approaches. We also conducted an objective benchmarking analysis using RNA-seq data sets from different species (human, mouse and Arabidopsis) and simulated data sets to present a systematic evaluation of 11 representative methods. Our benchmarking study showed that the overall performance of all tools investigated is moderate, indicating that there is still considerable scope to improve the prediction of APA sites or dynamic APA events from RNA-seq data. In particular, prediction results from individual tools differ considerably, and only a limited number of predicted APA sites or genes are shared among different tools. Accordingly, we attempted to give some advice on how to assess the reliability of the obtained results. We also proposed practical recommendations on choosing an appropriate method for different scenarios and discussed implications and future directions relevant to profiling APA from RNA-seq data.

Keywords: 3′ untranslated region; RNA-seq; alternative polyadenylation; benchmark; predictive modeling.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Humans
  • Polyadenylation
  • Sequence Analysis, RNA / methods*