Transcriptomic Analysis Pipeline (TAP) for quality control and functional assessment of transcriptomes

Res Sq [Preprint]. 2023 Oct 11:rs.3.rs-3390128. doi: 10.21203/rs.3.rs-3390128/v1.

Abstract

Background: RNA-sequencing (RNA-seq) has revolutionized the exploration of biological mechanisms, shedding light on the roles of non-coding RNAs, such as long non-coding RNAs (lncRNAs), across various biological processes, including stress responses. Despite these advancements, it remains unclear how the choice of RNA-seq library protocol affects comprehensive lncRNA expression analysis, particularly in non-mammalian organisms.

Results: In this study, we sought to bridge this knowledge gap by investigating lncRNA expression patterns in Drosophila melanogaster under thermal stress conditions. To achieve this, we conducted a comparative analysis of two RNA-seq library protocols: polyA+ RNA capture and rRNA-depletion. Our approach involved the development and application of a Transcriptomic Analysis Pipeline (TAP) designed to systematically assess both the technical and functional dimensions of RNA-seq, facilitating a robust comparison of these library protocols. Our findings underscore the efficacy of the polyA+ protocol in capturing the majority of expressed lncRNAs within the Drosophila melanogaster transcriptome. In contrast, rRNA-depletion exhibited limited advantages in the context of D. melanogaster studies. Notably, the polyA+ protocol demonstrated superior performance in terms of usable read yield and the accurate detection of splice junctions.

Conclusions: Our study introduces a versatile transcriptomic analysis pipeline, TAP, designed to uniformly process RNA-seq data from any organism with a reference genome. It also highlights the significance of selecting an appropriate RNA-seq library protocol tailored to the specific research context.

Background: Advances in next-generation sequencing (NGS) technologies enable the comprehensive analysis of genetic sequences of organisms in a relatively cost-effective manner [1, 2]. Among these technologies, RNA-sequencing (RNA-seq) has emerged as a preeminent method to study fundamental biological mechanisms at the level of cells, tissues, and whole organisms. RNA-seq enables the detection and quantification of various RNA populations, including messenger RNA (mRNA) and various species of non-coding RNA, such as long non-coding RNA (lncRNA), as well as an assessment of features including splice junctions in RNA.

Keywords: RNA-seq; long non-coding RNA; polyA+ selection; rRNA-depletion; splicing; thermal stress; transcriptome.

Publication types

  • Preprint