A novel batch-effect correction method for scRNA-seq data based on Adversarial Information Factorization

PLoS Comput Biol. 2024 Feb 22;20(2):e1011880. doi: 10.1371/journal.pcbi.1011880. eCollection 2024 Feb.

Abstract

Single-cell RNA sequencing (scRNA-seq) technology offers unprecedented resolution, down to the level of a single cell, raising great hopes in medicine. Nevertheless, scRNA-seq data suffer from high variation due to experimental conditions, known as batch effects, which prevents any aggregated downstream analysis. Adversarial Information Factorization provides a robust batch-effect correction method that relies neither on prior knowledge of the cell types nor on a specific normalization strategy, while remaining suited to any downstream analysis task. It matches and even outperforms state-of-the-art methods in several scenarios: low signal-to-noise ratio, batch-specific cell types with few cells, and a multi-batch dataset with imbalanced batches and batch-specific cell types. Moreover, it best preserves the relative gene expression between cell types, yielding superior differential expression analysis results. Finally, in the more complex setting of a leukemia cohort, our method preserved most of the underlying biological information for each patient while aligning the batches, improving the clustering metrics in the aggregated dataset.
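
To make the notion of a batch effect concrete, here is a minimal toy sketch. It is not the Adversarial Information Factorization method from the paper. It assumes made-up expression values for a single gene, two hypothetical batches, and two hypothetical cell types, and it applies a naive per-batch mean-centering baseline. The names `cells`, `group_gap`, and `center_per_batch` are illustrative only.

```python
from statistics import mean

# Toy data (invented): (batch, cell_type, expression of one gene).
# batch2 carries a constant +1.0 technical shift relative to batch1.
cells = [
    ("batch1", "T", 1.0), ("batch1", "T", 1.2),
    ("batch1", "B", 3.0), ("batch1", "B", 3.2),
    ("batch2", "T", 2.0), ("batch2", "T", 2.2),
    ("batch2", "B", 4.0), ("batch2", "B", 4.2),
]

def group_means(data, key_index):
    """Mean expression per group; key_index 0 groups by batch, 1 by cell type."""
    groups = {}
    for row in data:
        groups.setdefault(row[key_index], []).append(row[2])
    return {key: mean(values) for key, values in groups.items()}

def group_gap(data, key_index):
    """Spread (max minus min) of the per-group mean expression."""
    means = group_means(data, key_index).values()
    return max(means) - min(means)

def center_per_batch(data):
    """Naive baseline (an assumption, not the paper's method):
    subtract each batch's mean expression from its cells."""
    offsets = group_means(data, 0)
    return [(b, t, v - offsets[b]) for b, t, v in data]

corrected = center_per_batch(cells)

print("batch gap before:", round(group_gap(cells, 0), 3))
print("batch gap after: ", round(group_gap(corrected, 0), 3))
print("cell-type gap before:", round(group_gap(cells, 1), 3))
print("cell-type gap after: ", round(group_gap(corrected, 1), 3))
```

In this balanced toy case, centering removes the gap between batches while leaving the gap between cell types intact. The baseline works only because both batches contain the same mix of cell types. With batch-specific cell types or imbalanced batches, the scenarios highlighted in the abstract, per-batch centering would also remove part of the biological signal.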

MeSH terms

  • Algorithms
  • Benchmarking
  • Cluster Analysis
  • Exome Sequencing
  • Gene Expression Profiling
  • Humans
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Single-Cell Gene Expression Analysis*

Grants and funding

This work was supported by the Prism project, funded by the Agence Nationale de la Recherche under grant number ANR-18-IBHU-0002. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.