CODA: a combo-Seq data analysis workflow

Brief Bioinform. 2023 Jan 19;24(1):bbac582. doi: 10.1093/bib/bbac582.

Abstract

The analysis of the combined mRNA and miRNA content of a biological sample can be of interest for answering several research questions, such as biomarker discovery or the study of mRNA-miRNA interactions. However, the process is costly and time-consuming, because separate libraries need to be prepared and sequenced on different flowcells. Combo-Seq is a library prep kit that allows the preparation of combined mRNA-miRNA libraries starting from very low amounts of total RNA. To date, no dedicated bioinformatics method exists for the processing of Combo-Seq data. In this paper, we describe CODA (Combo-seq Data Analysis), a workflow specifically developed for the processing of Combo-Seq data that employs existing free-to-use tools. We compare CODA with exceRpt, the pipeline suggested by the kit manufacturer for this purpose. We also evaluate how Combo-Seq libraries analysed with CODA perform compared with conventional poly(A) and small RNA libraries prepared from the same samples. We show that CODA recovers more successfully trimmed reads than exceRpt, and that the difference is more pronounced with short sequencing reads. We demonstrate that Combo-Seq identifies as many genes as, but fewer miRNAs than, the standard libraries, and that miRNA validation favours conventional small RNA libraries over Combo-Seq. The CODA code is available at https://github.com/marta-nazzari/CODA.

Keywords: CODA; Combo-Seq; RNA-Seq; exceRpt; mRNA; miRNA.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Analysis
  • High-Throughput Nucleotide Sequencing / methods
  • MicroRNAs* / genetics
  • RNA, Messenger / genetics
  • Sequence Analysis, RNA / methods
  • Workflow

Substances

  • MicroRNAs
  • RNA, Messenger