RNA splicing analysis using heterogeneous and large RNA-seq datasets

Jorge Vaquero-Garcia; Joseph K Aicher; San Jewell; Matthew R Gazzara; Caleb M Radens; Anupama Jha; Scott S Norton; Nicholas F Lahens; Gregory R Grant; Yoseph Barash

doi:10.1038/s41467-023-36585-y

Nat Commun. 2023 Mar 3;14(1):1230. doi: 10.1038/s41467-023-36585-y.

Authors

Jorge Vaquero-Garcia^#¹, Joseph K Aicher^#¹ ², San Jewell^#¹, Matthew R Gazzara^#¹, Caleb M Radens¹, Anupama Jha³, Scott S Norton¹, Nicholas F Lahens⁴, Gregory R Grant¹ ⁴, Yoseph Barash⁵ ⁶

Affiliations

¹ Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA.
² Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
³ Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA.
⁴ Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA.
⁵ Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA. yosephb@upenn.edu.
⁶ Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA. yosephb@upenn.edu.

^# Contributed equally.

Abstract

The ubiquity of RNA-seq has led to many methods that use RNA-seq data to analyze variations in RNA splicing. However, available methods are not well suited for handling heterogeneous and large datasets. Such datasets scale to thousands of samples across dozens of experimental conditions, exhibit increased variability compared to biological replicates, and involve thousands of unannotated splice variants resulting in increased transcriptome complexity. We describe here a suite of algorithms and tools implemented in the MAJIQ v2 package to address challenges in detection, quantification, and visualization of splicing variations from such datasets. Using both large-scale synthetic data and GTEx v8 as benchmark datasets, we assess the advantages of MAJIQ v2 compared to existing methods. We then apply the MAJIQ v2 package to analyze differential splicing across 2,335 samples from 13 brain subregions, demonstrating its ability to offer insights into brain subregion-specific splicing regulation.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Benchmarking
Brain
RNA Splicing*
RNA-Seq
