R-SAP: a multi-threading computational pipeline for the characterization of high-throughput RNA-sequencing data

Nucleic Acids Res. 2012 May;40(9):e67. doi: 10.1093/nar/gks047. Epub 2012 Jan 28.

Abstract

The rapid expansion in the quantity and quality of RNA-Seq data requires sophisticated, high-performance bioinformatics tools that can rapidly transform these data into meaningful information that biologists can easily interpret. Currently available analysis tools are often difficult for the general biologist to install, and most of them lack inherent parallel-processing capabilities, which are widely recognized as an essential feature of next-generation bioinformatics tools. We present here a user-friendly and fully automated RNA-Seq analysis pipeline (R-SAP) with built-in multi-threading capability to analyze and quantitate high-throughput RNA-Seq datasets. R-SAP follows a hierarchical decision-making procedure to accurately characterize various classes of transcripts and achieves a near-linear decrease in data processing time as the number of threads increases. In addition, RNA expression level estimates obtained using R-SAP display high concordance with levels measured by microarrays.
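The two ideas named in the abstract, a hierarchical decision procedure applied to each record and a workload divided among threads, can be illustrated with a minimal sketch. It is not R-SAP's implementation: the record fields (`exon_overlap`, `in_gene`), the four category names, and the functions `classify`, `count_chunk`, `split` and `characterize` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def classify(record):
    """Hypothetical tiered rule: test the most specific case first."""
    # The fields and category names are assumptions, not R-SAP's data model.
    if record["exon_overlap"] == 1.0:
        return "known_exonic"
    if record["exon_overlap"] > 0.0:
        return "partial_exonic"
    if record["in_gene"]:
        return "intronic"
    return "intergenic"


def count_chunk(chunk):
    """Classify every record in one chunk and tally the categories."""
    return Counter(classify(r) for r in chunk)


def split(records, n):
    """Split records into n contiguous chunks of nearly equal size."""
    k, m = divmod(len(records), n)
    return [
        records[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
        for i in range(n)
    ]


def characterize(records, workers=4):
    """Process chunks concurrently, then merge the per-chunk tallies."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(count_chunk, split(records, workers))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Because each chunk is processed independently and the tallies are merged only at the end, the result does not depend on the number of workers. Under ideal conditions, the running time of such a scheme would fall roughly in proportion to the worker count, which is the kind of near-linear scaling the abstract reports. In standard CPython, the global interpreter lock prevents threads from speeding up CPU-bound work like this, so a process pool (`concurrent.futures.ProcessPoolExecutor`) would be the usual substitute there.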

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Computational Biology / methods
  • Gene Expression Profiling*
  • Genome, Human
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Sequence Alignment
  • Sequence Analysis, RNA*
  • Software*