UMI-count modeling and differential expression analysis for single-cell RNA sequencing

Genome Biol. 2018 May 31;19(1):70. doi: 10.1186/s13059-018-1438-9.

Abstract

Read counting and unique molecular identifier (UMI) counting are the principal gene expression quantification schemes used in single-cell RNA-sequencing (scRNA-seq) analysis. Using multiple scRNA-seq datasets, we reveal distinct differences between the count distributions produced by these two schemes and conclude that the negative binomial model is a good approximation for UMI counts, even in heterogeneous populations. We further propose a novel differential expression analysis algorithm based on a negative binomial model with independent dispersions in each group (NBID). Our results show that NBID properly controls the false discovery rate (FDR) and achieves better power for UMI counts than other recently developed packages for scRNA-seq analysis.

Keywords: Differential expression analysis; Negative binomial; Unique molecular identifier.
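
As an illustration of the idea named in the abstract, the following is a minimal sketch of a two-group test for a single gene under a negative binomial model in which each group has its own dispersion. It is not the authors' NBID implementation. The likelihood-ratio test, the optimizer, the parameter bounds, and all names (nb_loglik, nb_lrt, size_a, size_b) are assumptions made for this example. The mean is parameterized as size factor times exp(log-mean), and the variance as mu + phi * mu^2.

```python
import numpy as np
from scipy import optimize, stats
from scipy.special import gammaln


def nb_loglik(counts, mu, phi):
    """Negative binomial log-likelihood with mean mu and dispersion phi,
    so that var = mu + phi * mu**2."""
    r = 1.0 / phi
    return np.sum(
        gammaln(counts + r) - gammaln(r) - gammaln(counts + 1)
        + r * np.log(r / (r + mu))
        + counts * np.log(mu / (r + mu))
    )


def _neg_loglik(params, groups, shared_mean):
    # params holds the log-mean(s) first, then one log-dispersion per group.
    n_means = 1 if shared_mean else len(groups)
    log_mu, log_phi = params[:n_means], params[n_means:]
    total = 0.0
    for g, (counts, size) in enumerate(groups):
        mu = size * np.exp(log_mu[0 if shared_mean else g])
        total += nb_loglik(counts, mu, np.exp(log_phi[g]))
    return -total


def nb_lrt(counts_a, counts_b, size_a=None, size_b=None):
    """Likelihood-ratio test of equal means between two groups of cells,
    with an independent dispersion fitted in each group (illustrative)."""
    counts_a = np.asarray(counts_a, dtype=float)
    counts_b = np.asarray(counts_b, dtype=float)
    # Hypothetical per-cell size factors; default to 1 for every cell.
    size_a = np.ones_like(counts_a) if size_a is None else np.asarray(size_a, dtype=float)
    size_b = np.ones_like(counts_b) if size_b is None else np.asarray(size_b, dtype=float)
    groups = [(counts_a, size_a), (counts_b, size_b)]

    mean_bounds, disp_bounds = (-20.0, 20.0), (-10.0, 5.0)  # assumed bounds
    pooled = (counts_a.sum() + counts_b.sum()) / (size_a.sum() + size_b.sum())

    # Null model: one shared mean, two dispersions.
    null = optimize.minimize(
        _neg_loglik, np.array([np.log(pooled + 1e-8), 0.0, 0.0]),
        args=(groups, True), method="L-BFGS-B",
        bounds=[mean_bounds] + [disp_bounds] * 2,
    )
    # Alternative model: two means, two dispersions, started from the null fit.
    alt = optimize.minimize(
        _neg_loglik, np.array([null.x[0], null.x[0], null.x[1], null.x[2]]),
        args=(groups, False), method="L-BFGS-B",
        bounds=[mean_bounds] * 2 + [disp_bounds] * 2,
    )

    stat = max(0.0, 2.0 * (null.fun - alt.fun))
    return stat, stats.chi2.sf(stat, df=1)
```

Calling nb_lrt(counts_a, counts_b) on the UMI counts of one gene in two groups of cells returns the test statistic and a p-value. Across many genes, the p-values would still need a multiple-testing correction before any FDR statement can be made.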

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers / metabolism
  • Cell Line, Tumor
  • Gene Expression Profiling / methods*
  • Humans
  • Immunologic Memory
  • Models, Statistical*
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis
  • T-Lymphocytes / immunology
  • T-Lymphocytes / metabolism

Substances

  • Biomarkers