Deep single-cell RNA-seq data clustering with graph prototypical contrastive learning

Bioinformatics. 2023 Jun 1;39(6):btad342. doi: 10.1093/bioinformatics/btad342.

Abstract

Motivation: Single-cell RNA sequencing (scRNA-seq) enables researchers to study cellular heterogeneity at the single-cell level. To this end, identifying cell types with clustering techniques is an important task for downstream analysis. However, challenges of scRNA-seq data, such as pervasive dropout phenomena, hinder obtaining robust clustering outputs. Although existing studies try to alleviate these problems, they fall short of fully leveraging the relational information and mainly rely on reconstruction-based losses that depend heavily on data quality, which is sometimes noisy.

Results: This work proposes a graph-based prototypical contrastive learning method, named scGPCL. Specifically, scGPCL encodes cell representations using Graph Neural Networks on a cell-gene graph that captures the relational information inherent in scRNA-seq data, and it introduces prototypical contrastive learning to learn cell representations by pushing apart semantically dissimilar pairs and pulling together similar ones. Through extensive experiments on both simulated and real scRNA-seq data, we demonstrate the effectiveness and efficiency of scGPCL.
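
To make the two ideas in the Results paragraph concrete, the following is a minimal sketch, not the scGPCL implementation. It assumes a cell-gene graph can be represented as weighted edges between each cell and the genes it expresses, and it uses a generic prototype-level contrastive loss (softmax over cosine similarities to cluster prototypes). The function names, the toy count matrix, the embeddings, and the temperature value are all hypothetical choices made for illustration; the actual graph construction, encoder, and loss used by scGPCL may differ.

```python
import math


def build_cell_gene_edges(counts):
    """Return (cell, gene, weight) edges for every nonzero count.

    Assumption: one edge per expressed gene, weighted by the raw count.
    """
    return [
        (i, j, float(value))
        for i, row in enumerate(counts)
        for j, value in enumerate(row)
        if value > 0
    ]


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def prototypical_contrastive_loss(embeddings, assignments, prototypes,
                                  temperature=0.5):
    """Mean cross-entropy of each cell against all cluster prototypes.

    Each cell is pulled toward its assigned prototype and pushed away
    from the others. This is a generic formulation, assumed here for
    illustration only.
    """
    total = 0.0
    for z, k in zip(embeddings, assignments):
        logits = [cosine(z, c) / temperature for c in prototypes]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[k]
    return total / len(embeddings)


# Toy count matrix: 2 cells x 3 genes (hypothetical values).
counts = [[3, 0, 1], [0, 2, 0]]
edges = build_cell_gene_edges(counts)

# Toy embeddings and prototypes (hypothetical values).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
good = prototypical_contrastive_loss(embeddings, [0, 0, 1], prototypes)
bad = prototypical_contrastive_loss(embeddings, [1, 1, 0], prototypes)
```

In this toy setting, assigning each cell to its nearest prototype (`good`) yields a lower loss than the swapped assignment (`bad`), which is the behavior a prototype-level contrastive objective is meant to reward.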

Availability and implementation: Code is available at https://github.com/Junseok0207/scGPCL.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Gene Expression Profiling*
  • Sequence Analysis, RNA
  • Single-Cell Analysis / methods
  • Single-Cell Gene Expression Analysis
  • Software*