Improving cell type identification with Gaussian noise-augmented single-cell RNA-seq contrastive learning

Brief Funct Genomics. 2024 Jan 18:elad059. doi: 10.1093/bfgp/elad059. Online ahead of print.

Abstract

Cell type identification is an important task in single-cell RNA-sequencing (scRNA-seq) data analysis. Many prediction methods have recently been proposed, but predictive accuracy on difficult cell type identification tasks remains low. In this work, we propose a novel Gaussian noise augmentation-based scRNA-seq contrastive learning method (GsRCL) that learns discriminative feature representations for cell type identification tasks. A large-scale computational evaluation suggests that GsRCL outperforms other state-of-the-art predictive methods on difficult cell type identification tasks, while the conventional random gene masking augmentation-based contrastive learning method also generally improves accuracy on easy cell type identification tasks.
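
The two augmentation strategies named above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the noise scale `sigma`, the masking rate `mask_fraction` and the toy expression matrix are all assumptions chosen for illustration, since the abstract does not specify them. The sketch only shows how each augmentation might produce two stochastic "views" of the same cells, which a contrastive objective would then treat as a positive pair.

```python
import numpy as np


def gaussian_noise_view(x, sigma=0.1, rng=None):
    """Return a copy of x with zero-mean Gaussian noise added to every entry.

    sigma is a hypothetical noise scale, not a value taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)


def random_gene_mask_view(x, mask_fraction=0.2, rng=None):
    """Return a copy of x in which each entry is zeroed with probability mask_fraction.

    mask_fraction is a hypothetical masking rate, not a value taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(x.shape) >= mask_fraction  # True where the value is kept
    return x * keep


# Toy log-normalised expression matrix: 5 cells x 8 genes (synthetic data).
rng = np.random.default_rng(0)
expr = np.log1p(rng.poisson(lam=2.0, size=(5, 8)).astype(float))

# Two independent augmentations of the same cells form a positive pair.
noise_view_a = gaussian_noise_view(expr, sigma=0.1, rng=rng)
noise_view_b = gaussian_noise_view(expr, sigma=0.1, rng=rng)

mask_view_a = random_gene_mask_view(expr, mask_fraction=0.2, rng=rng)
mask_view_b = random_gene_mask_view(expr, mask_fraction=0.2, rng=rng)
```

In this sketch, Gaussian noise perturbs every gene slightly while keeping all of them present, whereas masking removes a random subset of values outright. Which behaviour suits a given dataset is an empirical question that the evaluation summarised above addresses.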

Keywords: Cell type identification; Contrastive learning; Data augmentation; scRNA-seq.