propeller: testing for differences in cell type proportions in single cell data

Bioinformatics. 2022 Oct 14;38(20):4720-4726. doi: 10.1093/bioinformatics/btac582.

Abstract

Motivation: Single cell RNA-sequencing (scRNA-seq) has rapidly gained popularity over the last few years for profiling the transcriptomes of thousands to millions of single cells. This technology is now being used to analyse experiments with complex designs including biological replication. One question that can be asked of single cell experiments, which has been difficult to address directly with bulk RNA-seq data, is whether the cell type proportions differ between two or more experimental conditions. As well as gene expression changes, the relative depletion or enrichment of a particular cell type can be the functional consequence of disease or treatment. However, cell type proportion estimates from scRNA-seq data are variable, and statistical methods that correctly account for the different sources of variability are needed to confidently identify statistically significant shifts in cell type composition between experimental conditions.

Results: We have developed propeller, a robust and flexible method that leverages biological replication to find statistically significant differences in cell type proportions between groups. Using simulated cell type proportions data, we show that propeller performs well under a variety of scenarios. We applied propeller to test for significant changes in cell type proportions related to human heart development, ageing and COVID-19 disease severity.
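To illustrate what testing cell type proportions with biological replication involves, the following is a minimal sketch. It is not the propeller implementation and does not use the speckle package. It assumes per-cell labels of the form (sample, cell type), a hypothetical toy dataset with three replicates per group, an arcsine square-root transform as a variance-stabilising step, and a plain Welch t-statistic in place of any moderated test. The function names sample_proportions and welch_t are illustrative only.

```python
from collections import Counter, defaultdict
from math import asin, sqrt
from statistics import mean, variance


def sample_proportions(cells):
    """Return {sample: {cell_type: proportion}} from (sample, cell_type) pairs."""
    counts = defaultdict(Counter)
    for sample, cell_type in cells:
        counts[sample][cell_type] += 1
    # Use the union of cell types so absent types get a proportion of 0.
    cell_types = sorted({ct for c in counts.values() for ct in c})
    return {
        s: {ct: c[ct] / sum(c.values()) for ct in cell_types}
        for s, c in counts.items()
    }


def welch_t(x, y):
    """Welch t-statistic for two lists of replicate-level values."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


# Hypothetical toy counts: 3 control and 3 treated replicates (assumed data).
toy_counts = {
    "ctrl_1": {"T": 60, "B": 40},
    "ctrl_2": {"T": 55, "B": 45},
    "ctrl_3": {"T": 65, "B": 35},
    "trt_1": {"T": 30, "B": 70},
    "trt_2": {"T": 35, "B": 65},
    "trt_3": {"T": 25, "B": 75},
}
groups = {
    "ctrl": ["ctrl_1", "ctrl_2", "ctrl_3"],
    "trt": ["trt_1", "trt_2", "trt_3"],
}

# Expand the counts into one (sample, cell_type) record per cell.
cells = [
    (sample, ct)
    for sample, by_type in toy_counts.items()
    for ct, n in by_type.items()
    for _ in range(n)
]

props = sample_proportions(cells)

# Compare groups per cell type using replicate-level transformed proportions.
t_stats = {}
for ct in ("T", "B"):
    ctrl = [asin(sqrt(props[s][ct])) for s in groups["ctrl"]]
    trt = [asin(sqrt(props[s][ct])) for s in groups["trt"]]
    t_stats[ct] = welch_t(ctrl, trt)

for ct, t in t_stats.items():
    print(f"{ct}: Welch t = {t:.2f}")
```

The point of the sketch is that the unit of analysis is the replicate sample, not the individual cell: the between-sample variability of the proportions, rather than the raw cell counts, determines how strong the evidence for a compositional shift is.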

Availability and implementation: The propeller method is publicly available in the open source speckle R package (https://github.com/phipsonlab/speckle). All the analysis code for the article is available at the associated analysis website: https://phipsonlab.github.io/propeller-paper-analysis/. The speckle package, analysis scripts and datasets have been deposited at https://doi.org/10.5281/zenodo.7009042.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Gene Expression Profiling
  • Humans
  • RNA
  • Sequence Analysis, RNA
  • Single-Cell Analysis*
  • Software

Substances

  • RNA