scASGC: An adaptive simplified graph convolution model for clustering single-cell RNA-seq data

Comput Biol Med. 2023 Sep:163:107152. doi: 10.1016/j.compbiomed.2023.107152. Epub 2023 Jun 12.

Abstract

Single-cell RNA sequencing (scRNA-seq) is now an established technique for identifying cellular heterogeneity, revealing novel cell subpopulations, and inferring developmental trajectories. A crucial step in processing scRNA-seq data is the precise identification of cell subpopulations. Although many unsupervised clustering methods have been developed to identify cell subpopulations, their performance is vulnerable to dropouts and high dimensionality. In addition, most existing methods are time-consuming and fail to adequately account for potential associations between cells. In this manuscript, we present scASGC, an unsupervised clustering method based on an adaptive simplified graph convolution model. The proposed method builds plausible cell graphs, aggregates neighbor information using a simplified graph convolution model, and adaptively determines the optimal number of convolution layers for each graph. Experiments on 12 public datasets show that scASGC outperforms both classical and state-of-the-art clustering methods. In addition, in a study of mouse intestinal muscle containing 15,983 cells, we identified distinct marker genes based on the clustering results of scASGC. The source code of scASGC is available at https://github.com/ZzzOctopus/scASGC.
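The pipeline described above (build a cell graph, aggregate neighbor information with simplified graph convolution, pick a number of layers) can be illustrated with a minimal NumPy sketch. This is not the scASGC implementation: the k-nearest-neighbor graph, the function names, and in particular the stopping rule in `choose_num_layers` are assumptions made for illustration, since the abstract does not specify how the graph is built or how the layer count is adapted. The propagation step follows the generic simplified graph convolution form, in which features are repeatedly multiplied by a symmetrically normalized adjacency matrix with self-loops.

```python
import numpy as np


def knn_graph(X, k):
    """Symmetric k-nearest-neighbor adjacency (Euclidean), without self-loops.

    Assumption: a plain kNN graph stands in for the cell graph here.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(dist, np.inf)                      # exclude self-matches
    nearest = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)                           # make the graph undirected


def normalized_adjacency(A):
    """Return S = D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sgc_propagate(X, S, num_layers):
    """Simplified graph convolution: compute S^num_layers @ X with no nonlinearity."""
    H = X.copy()
    for _ in range(num_layers):
        H = S @ H
    return H


def choose_num_layers(X, S, max_layers=10, tol=1e-2):
    """Hypothetical stopping rule, NOT the criterion used by scASGC.

    Stops once one more propagation step changes the features by less than
    `tol` in relative Frobenius norm, or when `max_layers` is reached.
    """
    H = X
    for layer in range(1, max_layers + 1):
        H_next = S @ H
        change = np.linalg.norm(H_next - H) / (np.linalg.norm(H) + 1e-12)
        if change < tol:
            return layer
        H = H_next
    return max_layers


# Toy usage: 40 synthetic "cells" with 6 features each (random data, not scRNA-seq).
rng = np.random.default_rng(0)
cells = rng.normal(size=(40, 6))
S = normalized_adjacency(knn_graph(cells, k=5))
layers = choose_num_layers(cells, S)
smoothed = sgc_propagate(cells, S, layers)
```

The smoothed feature matrix could then be passed to any standard clustering algorithm. Because the propagation has no trainable weights or nonlinearities, it reduces to a few matrix products, which is one reason simplified graph convolution tends to be cheaper than a full graph convolutional network.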

Keywords: Bioinformatics; Clustering; Computational biology; Graph convolution; Machine learning; ScRNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Cluster Analysis
  • Gene Expression Profiling* / methods
  • Mice
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis / methods
  • Single-Cell Gene Expression Analysis