A Beginner's Guide to Analysis of RNA Sequencing Data

Am J Respir Cell Mol Biol. 2018 Aug;59(2):145-157. doi: 10.1165/rcmb.2017-0430TR.

Abstract

Since the first publications coining the term RNA-seq (RNA sequencing) appeared in 2008, the number of publications containing RNA-seq data has grown exponentially, hitting an all-time high of 2,808 publications in 2016 (PubMed). With this wealth of RNA-seq data being generated, it is a challenge to extract maximal meaning from these datasets, and without the appropriate skills and background, there is risk of misinterpretation of these data. However, a general understanding of the principles underlying each step of RNA-seq data analysis allows investigators without a background in programming and bioinformatics to critically analyze their own datasets as well as published data. Our goals in the present review are to break down the steps of a typical RNA-seq analysis and to highlight the pitfalls and checkpoints along the way that are vital for bench scientists and biomedical researchers performing experiments that use RNA-seq.

Keywords: RNA sequencing; bioinformatics; data analysis; transcriptomics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Data Analysis*
  • Gene Expression Profiling* / methods
  • High-Throughput Nucleotide Sequencing* / methods
  • Male
  • Mice, Inbred C57BL
  • Quality Control
  • Sequence Analysis, RNA* / methods
  • Software
  • Transcriptome / genetics