Entropy subspace separation-based clustering for noise reduction (ENCORE) of scRNA-seq data

Nucleic Acids Res. 2021 Feb 22;49(3):e18. doi: 10.1093/nar/gkaa1157.

Abstract

Single-cell RNA sequencing enables us to characterize cellular heterogeneity at single-cell resolution with the help of cell type identification algorithms. However, the noise inherent in single-cell RNA-sequencing data severely reduces the accuracy of cell clustering, marker identification and visualization. We propose that clustering based on feature density profiles can distinguish informative features from noise. We named this strategy 'entropy subspace' separation and designed a cell clustering algorithm called ENtropy subspace separation-based Clustering for nOise REduction (ENCORE) by integrating the 'entropy subspace' separation strategy with a consensus clustering method. We demonstrate that ENCORE achieves superior cell clustering performance and generates high-resolution visualization across 12 standard datasets. More importantly, ENCORE enables identification of biologically significant group markers from a hard-to-separate dataset. With the advantages of effective feature selection, improved clustering, accurate marker identification and high-resolution visualization, we present ENCORE to the community as an important tool for scRNA-seq data analysis to study cellular heterogeneity and discover group markers.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • 3T3-L1 Cells
  • Algorithms
  • Animals
  • Cluster Analysis
  • Mice
  • RNA-Seq / methods*
  • Single-Cell Analysis / methods*