scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets

Bioinformatics. 2022 Apr 28;38(9):2642-2644. doi: 10.1093/bioinformatics/btac141.

Abstract

Summary: A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. Here, we present scGate, an algorithm that automates marker-based purification of specific cell populations without requiring training data or reference gene expression profiles. scGate purifies a cell population of interest using a set of markers organized in a hierarchical structure, akin to the gating strategies employed in flow cytometry. scGate outperforms state-of-the-art single-cell classifiers and can be applied to multiple modalities of single-cell data (e.g. RNA-seq, ATAC-seq, CITE-seq). It is implemented as an R package and integrated with the Seurat framework, providing an intuitive tool to isolate cell populations of interest from heterogeneous single-cell datasets.
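The idea of hierarchical, marker-based gating described above can be illustrated with a toy sketch. This is not scGate's implementation or its R API: the function names, the mean-expression score, the single fixed threshold, and the example cells are all assumptions made for illustration. It only shows how positive and negative markers arranged in successive levels can narrow a mixed population, much like sequential gates in flow cytometry.

```python
from typing import Dict, List

# Hypothetical data model: a cell maps gene names to normalized expression.
Cell = Dict[str, float]
# Hypothetical gating level: lists of "positive" and "negative" marker genes.
Level = Dict[str, List[str]]


def mean_expression(cell: Cell, genes: List[str]) -> float:
    """Average expression of the given genes; genes absent from the cell count as 0."""
    if not genes:
        return 0.0
    return sum(cell.get(g, 0.0) for g in genes) / len(genes)


def passes_level(cell: Cell, level: Level, threshold: float) -> bool:
    """A cell passes when its positive markers score at or above the threshold
    and its negative markers score below it (an assumed, simplified rule)."""
    positive = level.get("positive", [])
    negative = level.get("negative", [])
    pos_ok = mean_expression(cell, positive) >= threshold if positive else True
    neg_ok = mean_expression(cell, negative) < threshold if negative else True
    return pos_ok and neg_ok


def gate(cells: Dict[str, Cell], levels: List[Level], threshold: float = 1.0) -> List[str]:
    """Apply the levels in order; only cells that survive one level reach the next."""
    kept = dict(cells)
    for level in levels:
        kept = {cid: c for cid, c in kept.items() if passes_level(c, level, threshold)}
    return sorted(kept)


# Made-up example: gate immune cells first, then T cells within them.
cells = {
    "tcell": {"PTPRC": 2.0, "CD3E": 3.0},
    "bcell": {"PTPRC": 2.5, "CD19": 2.0},
    "fibro": {"COL1A1": 4.0},
}
levels = [
    {"positive": ["PTPRC"]},                      # level 1: immune cells
    {"positive": ["CD3E"], "negative": ["CD19"]},  # level 2: T cells, excluding B cells
]
purified = gate(cells, levels)
print(purified)
```

In this sketch the fibroblast-like cell is removed at the first level and the B-cell-like cell at the second, so only the T-cell-like cell remains. A real tool would need a more robust per-cell score and threshold than the plain mean used here.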

Availability and implementation: scGate is available as an R package at https://github.com/carmonalab/scGate (https://doi.org/10.5281/zenodo.6202614). Several reproducible workflows describing the main functions and usage of the package on different single-cell modalities, as well as the code to reproduce the benchmark, can be found at https://github.com/carmonalab/scGate.demo (https://doi.org/10.5281/zenodo.6202585) and https://github.com/carmonalab/scGate.benchmark. Test data are available at https://doi.org/10.6084/m9.figshare.16826071.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin Immunoprecipitation Sequencing
  • Exome Sequencing
  • RNA-Seq
  • Single-Cell Analysis*
  • Software*

Associated data

  • figshare/10.6084/m9.figshare.16826071