Scalable batch-correction approach for integrating large-scale single-cell transcriptomes

Brief Bioinform. 2022 Sep 20;23(5):bbac327. doi: 10.1093/bib/bbac327.

Abstract

Integration of accumulative large-scale single-cell transcriptomes requires scalable batch-correction approaches. Here we propose Fugue, a simple and efficient batch-correction method that is scalable for integrating super large-scale single-cell transcriptomes from diverse sources. The core idea of the method is to encode batch information as trainable parameters and add it to single-cell expression profile; subsequently, a contrastive learning approach is used to learn feature representation of the additive expression profile. We demonstrate the scalability of Fugue by integrating all single cells obtained from the Human Cell Atlas. We benchmark Fugue against current state-of-the-art methods and show that Fugue consistently achieves improved performance in terms of data alignment and clustering preservation. Our study will facilitate the integration of single-cell transcriptomes at increasingly large scale.

Keywords: data integration; deep learning; scalability; single cell.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Benchmarking
  • Cluster Analysis
  • Humans
  • Transcriptome*