A transcriptome software comparison for the analyses of treatments expected to give subtle gene expression responses

BMC Genomics. 2022 Jun 20;23(1):452. doi: 10.1186/s12864-022-08673-8.

Abstract

Background: In this comparative study we evaluate the performance of four software tools, DNAstar-D (DESeq2), DNAstar-E (edgeR), CLC Genomics and Partek Flow, for identification of differentially expressed genes (DEGs) using a transcriptome of E. coli. The RNA-seq data come from E. coli exposed to below-background radiation (5.5 nGy total dose, 0.2 nGy/hr) while grown shielded from natural radiation, 655 m below ground in a pre-World War II steel vault. The gene expression responses to three supplemented sources of radiation designed to mimic natural background (1952-5720 nGy total dose, 71-208 nGy/hr) are compared to this "radiation-deprived" treatment. In addition, RNA-seq data from the nematode Caenorhabditis elegans under similar radiation treatments were analyzed by three of the software packages.

Results: In E. coli, all four software programs found that one of the supplementary sources of radiation (KCl) evoked about 5 times more transcribed genes than the minus-radiation treatment (69-114 differentially expressed genes, DEGs), so the remaining analyses used this KCl vs "Minus" comparison. After a 30-read minimum cutoff was imposed, one of the DNAstar options shared two of the three steps (mapping, normalization, and statistics) with Partek Flow: both used median of ratios to normalize and the DESeq2 statistical package. These two programs identified the highest number of DEGs in common with each other (53). In contrast, when the programs used different approaches in each of the three steps, between 31 and 40 DEGs were found in common. Regarding the extent of expression differences, three of the four programs gave high fold-change results (15-178 fold), but one (DNAstar's DESeq2) gave more conservative fold-changes (1.5-3.5). In a parallel comparison of three commercial qPCR validation software programs, the results also varied as to which genes were significantly regulated. Similarly, the C. elegans analysis showed exaggerated fold-changes in CLC and DNAstar's edgeR, while DNAstar-D was more conservative.
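As an illustration of two steps named above, the following is a minimal Python sketch of median-of-ratios normalization and of counting DEGs shared between two tools. It is not the implementation used by DNAstar, Partek Flow or DESeq2; the function names, the toy count matrix and the gene names are hypothetical and chosen only for this example.

```python
import numpy as np


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Use only genes with a nonzero count in every sample, so logs are finite.
    usable = (counts > 0).all(axis=1)
    if not usable.any():
        raise ValueError("no gene has nonzero counts in every sample")
    log_counts = np.log(counts[usable])
    # Per-gene log geometric mean across samples (the pseudo-reference).
    log_reference = log_counts.mean(axis=1, keepdims=True)
    # Per-sample median of the log ratios to the reference.
    return np.exp(np.median(log_counts - log_reference, axis=0))


def shared_degs(calls_a, calls_b):
    """Genes called differentially expressed by both tools."""
    return set(calls_a) & set(calls_b)


# Toy data (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
counts = np.array([[10, 20], [100, 200], [5, 10], [0, 3]])
factors = size_factors(counts)
normalized = counts / factors  # divide each sample column by its size factor

# Hypothetical DEG calls from two tools.
tool_a = ["geneA", "geneB", "geneC"]
tool_b = ["geneB", "geneC", "geneD"]
print(factors, sorted(shared_degs(tool_a, tool_b)))
```

In this toy case the depth difference between the two samples is removed by the size factors, so any remaining difference between tools would come from mapping or the statistical step rather than from normalization.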

Conclusions: Regarding the extent of expression (fold-change), and considering the subtlety of the very low level radiation treatments, in E. coli three of the four programs gave what we consider exaggerated fold-change results (15-178 fold), but one (DNAstar's DESeq2) gave more realistic fold-changes (1.5-3.5). When RT-qPCR validation results were compared with the transcriptome results, they supported the more conservative expression results of DNAstar-D. When the response of another model organism (nematode) to these radiation differences was similarly analyzed, DNAstar-D again gave the most conservative expression patterns. Therefore, we propose DESeq2 ("DNAstar-D") as an appropriate software tool for differential gene expression studies of treatments expected to give subtle transcriptome responses.

Keywords: Fold-changes; Low radiation; RNA-Seq software; Small treatment differences; Transcriptome software.

MeSH terms

  • Animals
  • Caenorhabditis elegans / genetics
  • Escherichia coli* / genetics
  • Gene Expression Profiling / methods
  • Sequence Analysis
  • Sequence Analysis, RNA / methods
  • Software
  • Transcriptome*