KGLRR: A low-rank representation K-means with graph regularization constraint method for Single-cell type identification

Comput Biol Chem. 2023 Jun:104:107862. doi: 10.1016/j.compbiolchem.2023.107862. Epub 2023 Apr 3.

Abstract

Single-cell RNA sequencing technology provides a tremendous opportunity for studying disease mechanisms at the single-cell level. Cell type identification is a key step in the research of disease mechanisms. Many clustering algorithms have been proposed to identify cell types. Most clustering algorithms perform similarity calculation before cell clustering. Because clustering and similarity calculation are independent, a low-rank matrix obtained only by similarity calculation may be unable to fully reveal the patterns in single-cell data. In this study, to capture accurate single-cell clustering information, we propose a novel method based on a low-rank representation model, called KGLRR, that combines the low-rank representation approach with K-means clustering. The cluster centroid is updated as the cell dimension decreases to better form new clusters and improve the quality of clustering information. In addition, the low-rank representation model ignores local geometric information, so the graph regularization constraint is introduced. KGLRR is tested on both simulated and real single-cell datasets to validate the effectiveness of the new method. The experimental results show that KGLRR is more robust and accurate in cell type identification than other advanced algorithms.
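
The three ingredients named above (a low-rank representation, a graph regularization term, and K-means) can be illustrated with a minimal NumPy sketch. It is not the KGLRR algorithm: the abstract does not give the objective function or the update rules, and KGLRR couples the steps, whereas this sketch runs them one after another. The assumptions are the closed-form noise-free low-rank representation taken from a truncated SVD, a binary k-nearest-neighbor graph with an unnormalized Laplacian, a smoothness penalty of the form tr(Z L Z^T), and a plain Lloyd's K-means with farthest-point initialization. All function names and the toy data are hypothetical.

```python
import numpy as np

def low_rank_representation(X, rank):
    """Assumed stand-in: truncated closed-form LRR, Z = U_r U_r^T (cells x cells)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)  # X is cells x genes
    U_r = U[:, :rank]
    return U_r @ U_r.T

def knn_laplacian(X, k):
    """Unnormalized Laplacian L = D - W of a symmetrized binary kNN graph."""
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2 * X @ X.T  # squared distances
    np.fill_diagonal(dist, np.inf)                  # exclude self-neighbors
    idx = np.argsort(dist, axis=1)[:, :k]
    n = X.shape[0]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                          # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def kmeans(Y, n_clusters, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):                 # skip empty clusters
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

# Toy data (hypothetical): two "cell types" of 20 cells each, 6 "genes".
rng = np.random.default_rng(0)
type_means = np.array([[5, 5, 5, 0, 0, 0], [0, 0, 0, 5, 5, 5]], dtype=float)
X = np.vstack([m + 0.1 * rng.standard_normal((20, 6)) for m in type_means])

Z = low_rank_representation(X, rank=2)  # cell-to-cell representation
L = knn_laplacian(X, k=5)               # local geometry of the cells
penalty = np.trace(Z @ L @ Z.T)         # assumed graph regularization term
labels = kmeans(Z, n_clusters=2)        # cluster the rows of Z
```

On this toy input, Z is close to block-diagonal, so clustering its rows separates the two groups. A joint method such as the one described would instead update the representation and the centroids together, with the graph penalty entering the objective rather than being computed afterwards.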

Keywords: Graph regularization; K-means; Low rank regularization; Single-cell RNA sequencing data; Subspace clustering.

MeSH terms

  • Algorithms*
  • Cluster Analysis