Deconvolution of expression for nascent RNA-sequencing data (DENR) highlights pre-RNA isoform diversity in human cells

Yixin Zhao; Noah Dukler; Gilad Barshad; Shushan Toneyan; Charles G Danko; Adam Siepel

doi:10.1093/bioinformatics/btab582

Bioinformatics. 2021 Dec 11;37(24):4727-4736. doi: 10.1093/bioinformatics/btab582.

Authors

Yixin Zhao¹, Noah Dukler¹, Gilad Barshad²,³, Shushan Toneyan¹, Charles G Danko²,³, Adam Siepel¹

Affiliations

¹ Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
² Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA.
³ Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA.

Abstract

Motivation: Quantification of isoform abundance has been extensively studied at the mature RNA level using RNA-seq but not at the level of precursor RNAs using nascent RNA sequencing.

Results: We address this problem with a new computational method called Deconvolution of Expression for Nascent RNA-sequencing data (DENR), which models nascent RNA-sequencing read-counts as a mixture of user-provided isoforms. The baseline algorithm is enhanced by machine-learning predictions of active transcription start sites and an adjustment for the typical 'shape profile' of read-counts along a transcription unit. We show that DENR outperforms simple read-count-based methods for estimating gene and isoform abundances, and that transcription of multiple pre-RNA isoforms per gene is widespread, with frequent differences between cell types. In addition, we provide evidence that a majority of human isoform diversity derives from primary transcription rather than from post-transcriptional processes.

Availability and implementation: DENR and nascentRNASim are freely available at https://github.com/CshlSiepelLab/DENR (version v1.0.0) and https://github.com/CshlSiepelLab/nascentRNASim (version v0.3.0).

Supplementary information: Supplementary data are available at Bioinformatics online.
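
Illustrative sketch: the abstract describes modeling read-counts as a mixture of user-provided isoforms. The Python example below shows one minimal way such a mixture could be deconvolved, and it is not DENR's actual implementation. It assumes a toy locus split into equal-width bins, isoforms given as hypothetical (start_bin, end_bin) ranges, a flat read-count profile along each isoform (no shape-profile adjustment and no TSS prediction), and non-negative least squares solved by projected gradient descent. All function names and values are invented for illustration.

```python
import numpy as np


def isoform_design_matrix(isoforms, n_bins):
    """Build a bins x isoforms 0/1 matrix.

    Each isoform is a hypothetical (start_bin, end_bin) half-open range;
    entry (i, j) is 1 when isoform j covers bin i.
    """
    X = np.zeros((n_bins, len(isoforms)))
    for j, (start, end) in enumerate(isoforms):
        X[start:end, j] = 1.0
    return X


def deconvolve(X, y, n_iter=5000):
    """Estimate non-negative isoform abundances a that minimize ||y - X a||^2.

    Uses projected gradient descent: take a gradient step, then clip
    negative abundances to zero.
    """
    # Step size 1/L, where L is the largest eigenvalue of X^T X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ a - y)
        a = np.maximum(a - step * grad, 0.0)
    return a


# Toy locus: a long isoform and a shorter one with a downstream start site.
isoforms = [(0, 10), (4, 10)]
X = isoform_design_matrix(isoforms, n_bins=10)

# Simulated, noise-free per-bin read counts from known abundances.
true_abundance = np.array([3.0, 5.0])
y = X @ true_abundance

estimated = deconvolve(X, y)
print(estimated)
```

In this toy case the bins upstream of the second start site are covered only by the long isoform, which is what makes the two abundances separable. Real nascent RNA-sequencing data are noisy and non-uniform along the transcription unit, which is the motivation the abstract gives for the shape-profile adjustment and the start-site predictions.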

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Eukaryotic Initiation Factors / genetics
Humans
Protein Isoforms / genetics
RNA Isoforms* / genetics
RNA*
Sequence Analysis, RNA / methods
Software

Substances

RNA
RNA Isoforms
Protein Isoforms
DENR protein, human
Eukaryotic Initiation Factors
