Simulation, power evaluation and sample size recommendation for single-cell RNA-seq

Bioinformatics. 2020 Dec 8;36(19):4860-4868. doi: 10.1093/bioinformatics/btaa607.

Abstract

Motivation: Determining the sample size for adequate power to detect statistical significance is a crucial step at the design stage for high-throughput experiments. Even though a number of methods and tools are available for sample size calculation for microarray and RNA-seq in the context of differential expression (DE), this topic in the field of single-cell RNA sequencing is understudied. Moreover, the unique data characteristics present in scRNA-seq such as sparsity and heterogeneity increase the challenge.

Results: We propose POWSC, a simulation-based method, to provide power evaluation and sample size recommendation for single-cell RNA-sequencing DE analysis. POWSC consists of a data simulator that creates realistic expression data, and a power assessor that provides a comprehensive evaluation and visualization of the power and sample size relationship. The data simulator in POWSC outperforms two other state-of-art simulators in capturing key characteristics of real datasets. The power assessor in POWSC provides a variety of power evaluations including stratified and marginal power analyses for DEs characterized by two forms (phase transition or magnitude tuning), under different comparison scenarios. In addition, POWSC offers information for optimizing the tradeoffs between sample size and sequencing depth with the same total reads.

Availability and implementation: POWSC is an open-source R package available online at https://github.com/suke18/POWSC.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computer Simulation
  • Gene Expression Profiling
  • RNA-Seq
  • Sample Size
  • Sequence Analysis, RNA
  • Single-Cell Analysis*
  • Software*