On Consistent Entropy-Regularized k-Means Clustering With Feature Weight Learning: Algorithm and Statistical Analyses

IEEE Trans Cybern. 2023 Aug;53(8):4779-4790. doi: 10.1109/TCYB.2022.3166975. Epub 2023 Jul 18.

Abstract

Clusters in real data are often restricted to low-dimensional subspaces rather than the entire feature space. Recent approaches to circumvent this difficulty are often computationally inefficient and lack theoretical justification in terms of their large-sample behavior. This article addresses the problem by introducing an entropy incentive term to efficiently learn the feature importance within the framework of center-based clustering. A scalable block-coordinate descent algorithm, with closed-form updates, is used to minimize the proposed objective function. Using Vapnik-Chervonenkis (VC) theory, we establish strong consistency of our method along with uniform concentration bounds. The merits of our method are showcased through detailed experimental analysis on toy examples as well as real data clustering benchmarks.
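
The abstract does not state the objective function, so the following is only an illustrative sketch of how an entropy-regularized, feature-weighted k-means with closed-form block-coordinate updates might look. It assumes a generic objective of the form sum_l w_l * D_l + lam * sum_l w_l * log(w_l), with the weights w on the probability simplex and D_l the mean within-cluster squared deviation of feature l. Under that assumption the weight update is a softmax of -D_l / lam. The function name `entropy_weighted_kmeans`, the parameter `lam`, and the toy data are hypothetical and are not taken from the article.

```python
import numpy as np

def entropy_weighted_kmeans(X, k, lam=1.0, n_iter=50, seed=0):
    """Illustrative sketch only; the objective is an assumption, not the article's."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Initialize centers at k distinct data points and weights uniformly.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    w = np.full(p, 1.0 / p)
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Block 1: assign each point to the nearest center in weighted distance.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2   # shape (n, k, p)
        labels = np.argmin(sq @ w, axis=1)
        # Block 2: each center becomes the mean of its assigned points.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
        # Block 3: closed-form weight update under the assumed entropy penalty.
        D = ((X - centers[labels]) ** 2).mean(axis=0)     # per-feature dispersion
        logits = -D / lam
        logits -= logits.max()            # shift for numerical stability
        w = np.exp(logits)
        w /= w.sum()
    return labels, centers, w

# Toy data (hypothetical): one informative feature and two pure-noise features.
rng = np.random.default_rng(1)
informative = np.concatenate([rng.normal(-5, 0.3, 100), rng.normal(5, 0.3, 100)])
noise = rng.normal(0, 1, size=(200, 2))
X = np.column_stack([informative, noise])

labels, centers, w = entropy_weighted_kmeans(X, k=2, lam=0.5)
print("learned feature weights:", w)
```

In this sketch a smaller `lam` concentrates the weight on features with low within-cluster dispersion, while a larger `lam` pushes the weights toward uniform. How the article's own tuning parameter behaves may differ.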