Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench

RNA. 2017 Jun;23(6):823-835. doi: 10.1261/rna.059360.116. Epub 2017 Mar 13.

Abstract

Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information. A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this.

Keywords: UEA sRNA Workbench; quality checking; differential expression; high-throughput sequencing (HTS); microRNA (miRNA); normalization; small RNA (sRNA).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology* / methods
  • Computational Biology* / standards
  • Gene Expression Regulation*
  • High-Throughput Nucleotide Sequencing* / methods
  • High-Throughput Nucleotide Sequencing* / standards
  • RNA, Small Untranslated*
  • Reproducibility of Results
  • Sequence Analysis, RNA* / methods
  • Sequence Analysis, RNA* / standards
  • Software*
  • Workflow

Substances

  • RNA, Small Untranslated