Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads

Julianne K David; Sean K Maden; Mary A Wood; Reid F Thompson; Abhinav Nellore

doi:10.1186/s13059-022-02789-6

Genome Biol. 2022 Nov 11;23(1):240. doi: 10.1186/s13059-022-02789-6.

Authors

Julianne K David^# ¹ ² ³, Sean K Maden^# ¹ ² ⁴, Mary A Wood ¹ ⁵ ⁶, Reid F Thompson ⁷ ⁸ ⁹ ¹⁰ ¹¹, Abhinav Nellore ¹² ¹³ ¹⁴

Affiliations

¹ Computational Biology Program, Oregon Health & Science University, Portland, OR, USA.
² Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA.
³ Base5 Genomics, Inc., Mountain View, CA, USA.
⁴ Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
⁵ Portland VA Research Foundation, Portland, OR, USA.
⁶ Phase Genomics, Inc., Seattle, WA, USA.
⁷ Computational Biology Program, Oregon Health & Science University, Portland, OR, USA. thompsre@ohsu.edu.
⁸ Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA. thompsre@ohsu.edu.
⁹ Division of Hospital and Specialty Medicine, VA Portland Healthcare System, Portland, OR, USA. thompsre@ohsu.edu.
¹⁰ Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA. thompsre@ohsu.edu.
¹¹ Department of Radiation Medicine, Oregon Health & Science University, Portland, OR, USA. thompsre@ohsu.edu.
¹² Computational Biology Program, Oregon Health & Science University, Portland, OR, USA. anellore@gmail.com.
¹³ Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA. anellore@gmail.com.
¹⁴ Department of Surgery, Oregon Health & Science University, Portland, OR, USA. anellore@gmail.com.

^# Contributed equally.

Abstract

Background: There is growing interest in retained introns in a variety of disease contexts including cancer and aging. Many software tools have been developed to detect retained introns from short RNA-seq reads, but reliable detection is complicated by overlapping genes and transcripts as well as the presence of unprocessed or partially processed RNAs.

Results: We compared introns detected by 8 tools using short RNA-seq reads with introns observed in long RNA-seq reads from the same biological specimens. We found significant disagreement among tools (Fleiss' [Formula: see text]) such that 47.7% of all detected intron retentions were not called by more than one tool. We also observed poor performance of all tools, with none achieving an F1-score greater than 0.26, and qualitatively different behaviors between general-purpose alternative splicing detection tools and tools confined to retained intron detection.

Conclusions: Short-read tools detect intron retention with poor recall and precision, calling into question the completeness and validity of a large percentage of putatively retained introns called by commonly used methods.
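
The two summary statistics quoted in the Results (a per-tool F1-score and the share of detected intron retentions called by only one tool) can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the tool names, and the intron identifiers are hypothetical, and treating both calls and the long-read reference as plain sets of intron IDs is a simplification. The abstract does not describe how the authors matched short-read calls to long-read observations, so this is not their pipeline.

```python
from collections import Counter


def precision_recall_f1(called, truth):
    """Score one tool's intron calls against a reference set of intron IDs.

    Assumption: a call is a true positive if its ID appears in `truth`.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def fraction_called_by_one_tool(calls_by_tool):
    """Fraction of all distinct called introns that exactly one tool called."""
    counts = Counter(
        intron for calls in calls_by_tool.values() for intron in set(calls)
    )
    if not counts:
        return 0.0
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(counts)


# Toy data (hypothetical): introns observed as retained in long reads,
# and the calls made by three made-up short-read tools.
truth = {"i1", "i2", "i3", "i4"}
calls_by_tool = {
    "tool_a": {"i1", "i5"},
    "tool_b": {"i1", "i2", "i6"},
    "tool_c": {"i7"},
}

if __name__ == "__main__":
    for name, calls in calls_by_tool.items():
        p, r, f1 = precision_recall_f1(calls, truth)
        print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
    share = fraction_called_by_one_tool(calls_by_tool)
    print(f"called by exactly one tool: {share:.1%}")
```

Because F1 is a harmonic mean, it is low whenever either precision or recall is low, which is why a ceiling of 0.26 across all tools indicates that no tool does well on both at once.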

Keywords: Intron retention; RNA-seq; Splicing.

Publication types

Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Alternative Splicing*
Introns
RNA-Seq
Sequence Analysis, RNA / methods
Software*