Scaling up reproducible research for single-cell transcriptomics using MetaNeighbor

Nat Protoc. 2021 Aug;16(8):4031-4067. doi: 10.1038/s41596-021-00575-5. Epub 2021 Jul 7.

Abstract

Single-cell RNA-sequencing data have significantly advanced the characterization of cell-type diversity and composition. However, cell-type definitions vary across datasets and analysis pipelines, raising concerns about cell-type validity and generalizability. With MetaNeighbor, we proposed an efficient and robust quantification of cell-type replicability that preserves dataset independence and is highly scalable compared to dataset integration. In this protocol, we show how MetaNeighbor can be used to characterize cell-type replicability by following a simple three-step procedure: gene filtering, neighbor voting and visualization. We show how these steps can be tailored to quantify cell-type replicability, determine gene sets that contribute to cell-type identity and pretrain a model on a reference taxonomy to rapidly assess newly generated data. The protocol is based on an open-source R package available from Bioconductor and GitHub, requires basic familiarity with RStudio or the R command line and can typically be run in <5 min for millions of cells.
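The gene filtering and neighbor voting steps named above can be illustrated with a minimal sketch on simulated data. This is not the MetaNeighbor R package or its API: the function names (simulate, variable_genes, neighbor_voting), the variance-above-median gene filter, the rescaled Spearman network and the toy datasets are all assumptions chosen for illustration. Cells from a reference dataset vote for the cells in a target dataset that they resemble, and an AUROC summarizes how well each reference cell type recovers each target cell type.

```python
import numpy as np

def simulate(rng, n_per_type=30, n_genes=40):
    """Toy dataset (cells x genes): genes 0-9 mark type A, 10-19 mark type B, the rest are noise."""
    labels = np.array(["A"] * n_per_type + ["B"] * n_per_type)
    expr = rng.normal(0.0, 1.0, size=(2 * n_per_type, n_genes))
    expr[:n_per_type, :10] += 5.0
    expr[n_per_type:, 10:20] += 5.0
    return expr, labels

def variable_genes(datasets):
    """Simplified gene filter (assumption): keep genes whose variance is above the median in every dataset."""
    keep = None
    for expr in datasets:
        var = expr.var(axis=0)
        selected = set(np.flatnonzero(var > np.median(var)))
        keep = selected if keep is None else keep & selected
    return np.array(sorted(keep))

def rank_rows(x):
    """Ordinal ranks within each row (no tie handling; adequate for continuous toy data)."""
    return np.argsort(np.argsort(x, axis=1), axis=1).astype(float)

def spearman_network(test, train):
    """Spearman correlation between every test cell and every train cell, rescaled to [0, 1]."""
    a, b = rank_rows(test), rank_rows(train)
    a -= a.mean(axis=1, keepdims=True)
    b -= b.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return (a @ b.T + 1.0) / 2.0

def auroc(scores, positives):
    """AUROC from the rank-sum (Mann-Whitney) statistic; assumes no tied scores."""
    ranks = np.argsort(np.argsort(scores)) + 1.0
    n_pos = int(positives.sum())
    n_neg = len(scores) - n_pos
    return (ranks[positives].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def neighbor_voting(train, train_labels, test, test_labels):
    """For each (reference type, target type) pair, score test cells by their share of votes from the reference type."""
    net = spearman_network(test, train)
    degree = net.sum(axis=1)  # total connection weight of each test cell
    result = {}
    for ref in np.unique(train_labels):
        votes = net[:, train_labels == ref].sum(axis=1) / degree
        for target in np.unique(test_labels):
            result[(ref, target)] = auroc(votes, test_labels == target)
    return result

rng = np.random.default_rng(0)
ref_expr, ref_labels = simulate(rng)   # stands in for a reference dataset
new_expr, new_labels = simulate(rng)   # stands in for an independent dataset

genes = variable_genes([ref_expr, new_expr])
aurocs = neighbor_voting(ref_expr[:, genes], ref_labels, new_expr[:, genes], new_labels)
for (ref, target), value in sorted(aurocs.items()):
    print(f"reference {ref} -> target {target}: AUROC = {value:.2f}")
```

In this sketch an AUROC near 1 for a matching pair of labels indicates that the cell type replicates across the two toy datasets, whereas values near 0.5 would indicate no better than random matching. The third step, visualization, would typically display the full AUROC matrix as a heatmap and is omitted here.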

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Brain / cytology
  • Datasets as Topic
  • Gene Expression Regulation
  • Humans
  • Mice
  • Reproducibility of Results
  • Single-Cell Analysis / methods*
  • Software*
  • Transcriptome*

Associated data

  • figshare/10.6084/m9.figshare.13020569
  • figshare/10.6084/m9.figshare.13034171