Donor white blood cell differential is the single largest determinant of whole blood gene expression patterns

Genomics. 2023 Nov;115(6):110708. doi: 10.1016/j.ygeno.2023.110708. Epub 2023 Sep 18.

Abstract

It has become widely accepted that sample cellular composition is a significant determinant of the gene expression patterns observed in any transcriptomic experiment performed with bulk tissue. Despite this, many investigations currently performed with whole blood do not experimentally account for possible inter-specimen differences in cellularity, and often assume that any observed gene expression differences are a result of true differences in nuclear transcription. To determine how strongly this assumption may confound results, we recruited a large cohort of human donors (n = 138) and used a combination of next generation sequencing and flow cytometry to quantify and compare the contributions of variance in leukocyte counts versus variance in other biological factors to overall variance in whole blood transcript levels. Our results suggest that donor neutrophil and lymphocyte counts alone are the primary determinants of whole blood transcript levels for up to 75% of the protein-coding genes expressed in peripheral circulation, whereas other factors such as age, sex, race, ethnicity, and common disease states have comparatively minimal influence. Broadly, this implies that a majority of gene expression differences observed in experiments performed with whole blood are driven by latent differences in leukocyte counts, and that cell count heterogeneity must be accounted for to interpret the results in a biologically meaningful way.
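The comparison described above, variance attributable to leukocyte counts versus variance attributable to other donor factors, can be illustrated with a minimal sketch for a single gene. This is not the authors' pipeline: the abstract does not state which statistical model was used, so the ordinary least squares R² comparison here is an assumption chosen for simplicity. The function names (`r_squared`, `partition_variance`), the covariates, and all numeric values are hypothetical, and the data are simulated; only the cohort size (n = 138) comes from the abstract.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of variance in y explained by an ordinary least squares fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def partition_variance(cell_counts, other_factors, expression):
    """Compare nested linear models to see what each covariate group explains."""
    full = r_squared(np.column_stack([cell_counts, other_factors]), expression)
    cells_only = r_squared(cell_counts, expression)
    others_only = r_squared(other_factors, expression)
    return {
        "full": full,
        "cells_only": cells_only,
        "others_only": others_only,
        # Variance explained by one group beyond what the other group explains.
        "unique_to_cells": full - others_only,
        "unique_to_others": full - cells_only,
    }

# Simulated donors: every value below is synthetic, not study data.
rng = np.random.default_rng(0)
n = 138
neutrophils = rng.normal(4.0, 1.2, n)   # hypothetical counts, 10^3 cells/uL
lymphocytes = rng.normal(2.0, 0.6, n)
age = rng.uniform(20, 80, n)
sex = rng.integers(0, 2, n)

# Hypothetical transcript whose level mostly tracks cell counts.
expression = (1.5 * neutrophils - 0.8 * lymphocytes
              + 0.005 * age + rng.normal(0, 0.5, n))

result = partition_variance(
    np.column_stack([neutrophils, lymphocytes]),
    np.column_stack([age, sex]),
    expression,
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Repeating such a comparison across every expressed gene would give a per-gene picture of which covariate group dominates. In practice, a dedicated variance-partitioning or mixed-model framework would likely be preferable to this simple nested-model comparison.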

Keywords: Determinants of gene expression; Differential expression analysis; RNA sequencing; Transcriptomics; White blood cell count; Whole blood.

MeSH terms

  • Gene Expression Profiling
  • Humans
  • Leukocyte Count
  • Leukocytes*
  • Transcriptome*