Bioinformatics Pipeline for Transcriptome Sequencing Analysis

Methods Mol Biol. 2017:1468:201-19. doi: 10.1007/978-1-4939-4035-6_14.

Abstract

The development of high-throughput sequencing (HTS) for RNA profiling (RNA-seq) has shed light on the diversity of transcriptomes. While RNA-seq is becoming the de facto standard for monitoring the population of expressed transcripts in a given condition at a specific time, processing the large volume of data it generates requires dedicated bioinformatics programs. Here, we describe a standard bioinformatics protocol using state-of-the-art tools: the STAR mapper to align reads onto a reference genome, Cufflinks to reconstruct the transcriptome, and RSEM to quantify expression levels of genes and transcripts. We present the workflow using human transcriptome sequencing data from two biological replicates of the K562 cell line, produced as part of the ENCODE3 project.
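
As a rough illustration of how the three steps named above chain together, the following Python sketch assembles one command line per tool for a single paired-end sample. It is not the chapter's actual protocol: the file names, index locations, sample label, and thread count are hypothetical placeholders, and the option choices (for example, having STAR also emit transcriptome-space alignments for RSEM, or passing an annotation to Cufflinks as a guide) are assumptions. The RSEM `--alignments` option is named `--bam` in older RSEM releases.

```python
import shlex


def star_command(genome_dir, fastq1, fastq2, prefix, threads=4):
    """Align paired-end reads to the genome with STAR (sorted BAM output)."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq1, fastq2,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        # Assumption: also write transcriptome-space alignments for RSEM.
        "--quantMode", "TranscriptomeSAM",
        "--outFileNamePrefix", prefix,
    ]


def cufflinks_command(genome_bam, annotation_gtf, out_dir, threads=4):
    """Reconstruct transcripts from genome alignments, guided by an annotation."""
    return ["cufflinks", "-p", str(threads), "-o", out_dir,
            "-g", annotation_gtf, genome_bam]


def rsem_command(transcriptome_bam, rsem_reference, sample, threads=4):
    """Quantify gene and transcript expression from transcriptome alignments."""
    return ["rsem-calculate-expression", "--paired-end", "--alignments",
            "-p", str(threads), transcriptome_bam, rsem_reference, sample]


def build_pipeline(sample, fastq1, fastq2, genome_dir, annotation_gtf,
                   rsem_reference, threads=4):
    """Return the three commands in execution order for one sample."""
    prefix = sample + "."
    # STAR appends these fixed suffixes to the output prefix.
    genome_bam = prefix + "Aligned.sortedByCoord.out.bam"
    transcriptome_bam = prefix + "Aligned.toTranscriptome.out.bam"
    return [
        star_command(genome_dir, fastq1, fastq2, prefix, threads),
        cufflinks_command(genome_bam, annotation_gtf,
                          sample + "_cufflinks", threads),
        rsem_command(transcriptome_bam, rsem_reference, sample, threads),
    ]


if __name__ == "__main__":
    # Hypothetical inputs for one replicate; nothing is executed here.
    for cmd in build_pipeline("rep1", "rep1_R1.fastq", "rep1_R2.fastq",
                              "star_index", "annotation.gtf", "rsem_ref"):
        print(shlex.join(cmd))
```

The sketch only prints the commands, so it can be inspected without the tools installed; each list could be passed to `subprocess.run(cmd, check=True)` to execute the step. Running it once per biological replicate would mirror the two-replicate design described above.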

Keywords: Bioinformatics workflow; Protocols; RNA-seq; Transcriptome sequencing.

MeSH terms

  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • K562 Cells
  • Sequence Analysis, RNA
  • Workflow