Tissue-specific transcript annotation and expression profiling with complementary next-generation sequencing technologies

Nucleic Acids Res. 2010 Sep;38(16):e165. doi: 10.1093/nar/gkq602. Epub 2010 Jul 7.

Abstract

Next-generation sequencing is well suited to measuring mRNA abundance in gene expression studies. Here we compare two alternative technologies, cap analysis of gene expression (CAGE) and serial analysis of gene expression (SAGE), on the same RNA samples. In addition to quantifying gene expression levels, CAGE can identify tissue-specific transcription start sites, while SAGE monitors 3'-end usage. We used both methods to gain insight into the transcriptional control of myogenesis, comparing gene expression in differentiated and proliferating C2C12 myoblast cells and statistically evaluating both reproducibility and differential expression. Both CAGE and SAGE provided highly reproducible data (Pearson's correlations >0.92 among biological triplicates). With both methods we found around 10,000 genes expressed at levels >2 transcripts per million (approximately 0.3 copies per cell), with an overlap of 86%. We identified 4304 and 3846 genes differentially expressed between proliferating and differentiated C2C12 cells by CAGE and SAGE, respectively, of which 2144 were identified by both methods. We also identified 196 novel regulatory regions with preferential use in proliferating or differentiated cells. Next-generation sequencing of CAGE and SAGE libraries provides consistent expression levels and can enrich current genome annotations with tissue-specific promoters and alternative 3'-UTR usage.
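
The quantities reported in the abstract (transcripts per million, Pearson's correlation between replicates, an expression threshold, and the overlap between gene sets) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which the abstract does not describe. All function names are hypothetical and the tag counts are invented toy data. Only the gene totals 4304, 3846 and 2144 come from the abstract.

```python
from math import sqrt


def tags_per_million(counts):
    """Scale raw per-gene tag counts so that they sum to one million."""
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def expressed_genes(tpm, threshold=2.0):
    """Genes above the threshold (the abstract uses >2 transcripts per million)."""
    return {gene for gene, value in tpm.items() if value > threshold}


def overlap_summary(n_a, n_b, n_shared):
    """Express a shared gene count as a fraction of each set and of their union."""
    return {
        "fraction_of_a": n_shared / n_a,
        "fraction_of_b": n_shared / n_b,
        "jaccard": n_shared / (n_a + n_b - n_shared),
    }


# Invented toy tag counts for two replicate libraries (not data from the paper).
replicate_1 = {"geneA": 500, "geneB": 300, "geneC": 200}
replicate_2 = {"geneA": 980, "geneB": 620, "geneC": 400}

tpm_1 = tags_per_million(replicate_1)
tpm_2 = tags_per_million(replicate_2)
genes = sorted(tpm_1)
replicate_r = pearson([tpm_1[g] for g in genes], [tpm_2[g] for g in genes])

# Differentially expressed gene counts taken from the abstract:
# 4304 by CAGE, 3846 by SAGE, 2144 by both.
de_overlap = overlap_summary(4304, 3846, 2144)
```

With the abstract's counts, the 2144 shared genes correspond to roughly 50% of the CAGE calls and 56% of the SAGE calls. The abstract does not say which denominator underlies its 86% overlap of expressed genes, so the sketch reports all three fractions rather than assuming one.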

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions
  • Animals
  • Cell Line
  • Gene Expression Profiling / methods*
  • Mice
  • Models, Biological
  • Muscle Development / genetics
  • Myoblasts / metabolism*
  • Oligonucleotide Array Sequence Analysis
  • Reproducibility of Results
  • Sequence Alignment
  • Sequence Analysis, RNA*
  • Transcription Initiation Site

Substances

  • 3' Untranslated Regions