Using the Kriging Correlation for unsupervised feature selection problems

Sci Rep. 2022 Jul 7;12(1):11522. doi: 10.1038/s41598-022-15529-4.

Abstract

This paper proposes the KC Score, a measure of feature importance for clustering analysis of high-dimensional data. The KC Score evaluates the contribution of each feature based on the correlation between the original features and the features reconstructed from a low-dimensional latent space. A KC Score-based feature selection strategy is further developed for clustering analysis. We investigate the performance of the proposed strategy in a study of four single-cell RNA sequencing (scRNA-seq) datasets. The results show that our strategy effectively selects important features for clustering. In particular, in three of the four datasets, the proposed strategy selected fewer than 5% of the features and achieved clustering performance equal to or better than that obtained using all of the features.
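The general idea of scoring a feature by how well it correlates with its own reconstruction from a low-dimensional latent space can be illustrated with a minimal sketch. The abstract does not specify how the latent space or the reconstruction is built, so this sketch substitutes plain PCA (via SVD) for that step; it is not the paper's Kriging-based procedure. The function names, the synthetic data, and the 5% selection fraction used here are all assumptions made for illustration.

```python
import numpy as np


def reconstruction_correlation_scores(X, n_components=2):
    """Score each feature by the Pearson correlation between its original
    values and its reconstruction from a low-dimensional latent space.

    Assumption: PCA (truncated SVD) stands in for the latent-space model.
    This is NOT the Kriging-based reconstruction of the paper.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    X_hat = (U[:, :k] * s[:k]) @ Vt[:k]          # rank-k reconstruction
    Xh_c = X_hat - X_hat.mean(axis=0)
    num = (Xc * Xh_c).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0) * (Xh_c ** 2).sum(axis=0))
    # Features with zero variance get a score of 0 instead of NaN.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


def select_top_features(scores, fraction=0.05):
    """Return the indices of the highest-scoring features."""
    n_keep = max(1, int(np.ceil(fraction * scores.size)))
    return np.argsort(scores)[::-1][:n_keep]


# Synthetic data (hypothetical): 5 features driven by 2 latent factors,
# followed by 95 pure-noise features.
rng = np.random.default_rng(0)
n_samples = 500
latent = rng.normal(size=(n_samples, 2))
loadings = np.array([[3, 0], [0, 3], [2, 2], [2, -2], [3, 1]], dtype=float)
informative = latent @ loadings.T + 0.1 * rng.normal(size=(n_samples, 5))
noise = rng.normal(size=(n_samples, 95))
X = np.hstack([informative, noise])

scores = reconstruction_correlation_scores(X, n_components=2)
selected = select_top_features(scores, fraction=0.05)
print(sorted(selected.tolist()))
```

In this toy setting the features tied to the latent factors are reconstructed almost perfectly and receive scores near 1, while the noise features score much lower, so a small top fraction is enough to recover them. On real scRNA-seq data, the choice of latent model, the number of components, and the selection threshold would all need to follow the paper's actual method.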

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Spatial Analysis