A benchmarking of pipelines for detecting ncRNAs from RNA-Seq data

Brief Bioinform. 2020 Dec 1;21(6):1987-1998. doi: 10.1093/bib/bbz110.

Abstract

Next-Generation Sequencing (NGS) is a high-throughput technology widely applied to genome sequencing and transcriptome profiling. RNA-Seq uses NGS to reveal the identities and quantities of RNAs in a given sample. However, it produces a huge amount of raw data that need to be preprocessed with fast and effective computational methods. RNA-Seq can examine different populations of RNAs, including non-coding RNAs (ncRNAs). Indeed, in the last few years, several pipelines have been developed for ncRNA analysis from RNA-Seq experiments. In this paper, we analyze eight recent pipelines (iSmaRT, iSRAP, miARma-Seq, Oasis 2, SPORTS1.0, sRNAnalyzer, sRNApipe, sRNA workbench) which allow the analysis not only of a single specific class of ncRNAs but also of more than one ncRNA class. Our systematic performance evaluation aims at guiding users to select the appropriate pipeline for processing each ncRNA class, focusing on three key points: (i) accuracy in ncRNA identification, (ii) accuracy in read count estimation and (iii) deployment and ease of use.
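
The abstract does not state which metrics were used for points (i) and (ii). As a minimal sketch only, assuming identification accuracy is scored with precision, recall and F1 against a known set of ncRNAs, and read count accuracy with a Pearson correlation between estimated and expected counts, the two checks might look like the following. The function names and the ncRNA identifiers are hypothetical and are not taken from the paper or from any of the eight pipelines.

```python
from math import sqrt


def identification_scores(detected, truth):
    """Precision, recall and F1 of detected ncRNA IDs against a truth set.

    Assumption: both arguments are iterables of ncRNA identifiers.
    """
    detected, truth = set(detected), set(truth)
    true_positives = len(detected & truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def count_correlation(estimated, expected):
    """Pearson correlation of read counts over ncRNAs present in both dicts.

    Assumption: each argument maps an ncRNA identifier to a read count.
    """
    shared = sorted(set(estimated) & set(expected))
    if len(shared) < 2:
        raise ValueError("need at least two shared ncRNAs")
    xs = [estimated[name] for name in shared]
    ys = [expected[name] for name in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant counts")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical identifiers and counts, for illustration only.
    truth = {"ncRNA_A", "ncRNA_B", "ncRNA_C", "ncRNA_D"}
    detected = {"ncRNA_A", "ncRNA_C", "ncRNA_X"}
    print(identification_scores(detected, truth))

    expected = {"ncRNA_A": 100, "ncRNA_B": 10, "ncRNA_C": 50}
    estimated = {"ncRNA_A": 200, "ncRNA_B": 20, "ncRNA_C": 100}
    print(count_correlation(estimated, expected))
```

Pearson correlation measures agreement only up to a linear scaling, so a pipeline that doubles every count would still score highly; a rank-based or error-based measure may be preferable depending on the question being asked.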

Keywords: NGS; RNA-Seq; RNA-Seq pipeline; ncRNAs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Benchmarking*
  • Chromosome Mapping
  • Exome Sequencing
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing / methods
  • RNA
  • RNA, Untranslated* / genetics
  • RNA-Seq*
  • Sequence Analysis, RNA / methods
  • Software

Substances

  • RNA, Untranslated
  • RNA