Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies

Nat Commun. 2020 Mar 27;11(1):1585. doi: 10.1038/s41467-020-15298-6.

Abstract

Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single-cell RNA sequencing (scRNA-seq) studies. Here, we develop an integrative and scalable computational method, iDEA, to perform joint DE and GSE analysis through a hierarchical Bayesian framework. By integrating DE and GSE analyses, iDEA can improve the power and consistency of DE analysis and the accuracy of GSE analysis. Importantly, iDEA uses only DE summary statistics as input, enabling effective data modeling through complementing and pairing with various existing DE methods. We illustrate the benefits of iDEA with extensive simulations. We also apply iDEA to analyze three scRNA-seq data sets, where iDEA achieves up to five-fold power gain over existing GSE methods and up to 64% power gain over existing DE methods. The power gain brought by iDEA allows us to identify many pathways that would not be identified by existing approaches in these data.
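
To make the "summary statistics as input" idea concrete, the sketch below takes per-gene DE effect sizes and standard errors, plus a binary gene set annotation, and computes a crude enrichment odds ratio. This is not iDEA's hierarchical Bayesian model, which the abstract does not specify in detail; it is a deliberately simple threshold-based stand-in. The function names (`z_scores`, `enrichment_odds_ratio`), the 1.96 cutoff, and the toy numbers are all assumptions made for illustration.

```python
def z_scores(betas, ses):
    """Per-gene z-scores from DE effect sizes and their standard errors."""
    return [b / s for b, s in zip(betas, ses)]


def enrichment_odds_ratio(z, in_set, z_cut=1.96):
    """Odds of a gene being called DE (|z| > z_cut) inside vs. outside a gene set.

    Illustrative stand-in only, NOT the iDEA model. A 0.5 continuity
    correction (Haldane-Anscombe) keeps the ratio finite when a cell is empty.
    """
    a = b = c = d = 0
    for zi, member in zip(z, in_set):
        is_de = abs(zi) > z_cut
        if member and is_de:
            a += 1          # in set, DE
        elif member:
            b += 1          # in set, not DE
        elif is_de:
            c += 1          # outside set, DE
        else:
            d += 1          # outside set, not DE
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Toy summary statistics for eight hypothetical genes.
betas = [2.0, 1.5, -1.8, 0.1, -0.2, 0.05, 0.3, -0.1]
ses = [0.5] * 8
in_set = [1, 1, 1, 0, 0, 0, 0, 1]   # 1 = gene belongs to the gene set

z = z_scores(betas, ses)
odds_ratio = enrichment_odds_ratio(z, in_set)
print(round(odds_ratio, 2))
```

An odds ratio well above 1 suggests that DE genes are over-represented in the set. A joint model such as the one described in the abstract would instead let the gene set annotation inform each gene's DE evidence, rather than thresholding first and testing afterwards.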

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Computer Simulation
  • Endoderm / cytology
  • Endothelial Cells / metabolism
  • Gene Expression Profiling*
  • Gene Expression Regulation*
  • Human Embryonic Stem Cells / metabolism
  • Humans
  • Mice
  • Models, Genetic
  • RNA-Seq*
  • Sensory Receptor Cells / metabolism
  • Single-Cell Analysis*
  • Statistics as Topic*