Alternative strategies for development of a reference transcriptome for quantification of allele specific expression in organisms having sparse genomic resources

Comp Biochem Physiol Part D Genomics Proteomics. 2013 Mar;8(1):11-6. doi: 10.1016/j.cbd.2012.10.006. Epub 2012 Nov 9.

Abstract

In recent years, RNA-Seq technology has been used not only to quantify differences in gene expression but also to understand the underlying mechanisms that lead to these differences. Nucleotide sequence variation arising through evolution may differentially affect the expression profiles of divergent species. RNA-Seq technology, combined with techniques to differentiate parental alleles and quantify their abundance, has recently become a popular method for allele-specific gene expression (ASGE) analyses. However, analysis of gene expression within interspecies hybrids may be difficult when one of the two parental genomes represented in the hybrid does not have robust genomic resources or available transcriptome data. Herein, we compare two strategies for analyzing allele-specific expression within interspecies hybrids produced from crossing two Xiphophorus fish species. The first strategy relies upon a robust reference transcriptome assembly from one species, followed by identification of SNPs and creation of an in silico reference transcriptome for the second species. The second strategy employs de novo assembly of reference transcriptomes for both parental species, followed by identification of homologous transcripts prior to mapping hybrid reads to a combined hybrid reference. Our results show that, although both methods are able to achieve balanced allelic distribution upon read mapping of F1 hybrid fish transcriptomes, the second "de novo" assembly approach is superior for ASGE analyses and leads to results more consistent with those found from quantitative real-time PCR assessment of gene expression. In addition, our analysis indicates that indels between the two parental alleles are the major cause of the differences in results observed when employing these two methods.
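The core idea of the first strategy, and the per-gene allelic balance that both strategies aim to measure, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `apply_snps` and `allelic_fraction`, the 0-based SNP positions, and the toy sequence are all assumptions made for illustration. The sketch handles single-base substitutions only, which mirrors the limitation noted in the abstract: an in silico reference built from SNPs alone cannot represent indels between the parental alleles.

```python
from typing import Mapping


def apply_snps(reference: str, snps: Mapping[int, str]) -> str:
    """Build a hypothetical in silico transcript for the second species.

    `reference` is a transcript from the species with the robust assembly.
    `snps` maps 0-based positions (an assumed convention) to the alternate
    base observed in the second species. Only substitutions are supported;
    indels cannot be expressed in this representation.
    """
    bases = list(reference)
    for pos, alt in snps.items():
        if not 0 <= pos < len(bases):
            raise ValueError(f"SNP position {pos} is outside the transcript")
        if len(alt) != 1:
            raise ValueError("only single-base substitutions are supported")
        bases[pos] = alt
    return "".join(bases)


def allelic_fraction(count_a: int, count_b: int) -> float:
    """Fraction of allele-assigned hybrid reads that map to parental allele A."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no reads were assigned to either allele")
    return count_a / total


# Toy example: two substitutions distinguish the parental alleles.
species_a = "ACGTACGT"
species_b = apply_snps(species_a, {1: "T", 6: "A"})

# Hypothetical read counts for one gene in an F1 hybrid.
fraction_a = allelic_fraction(60, 40)
```

A fraction near 0.5 across many genes would correspond to the balanced allelic distribution described above; in practice, read counts would come from mapping hybrid reads to the combined reference rather than from hard-coded values.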

Publication types

  • Research Support, American Recovery and Reinvestment Act
  • Research Support, N.I.H., Extramural

MeSH terms

  • Alleles*
  • Animals
  • Computer Simulation
  • Cyprinodontiformes / genetics*
  • Cyprinodontiformes / metabolism
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation*
  • Genomics / methods
  • Hybridization, Genetic
  • Models, Genetic
  • Real-Time Polymerase Chain Reaction
  • Reference Values
  • Sequence Analysis, RNA
  • Species Specificity
  • Transcriptome*