An Entropy Regularization k-Means Algorithm with a New Measure of between-Cluster Distance in Subspace Clustering

Entropy (Basel). 2019 Jul 12;21(7):683. doi: 10.3390/e21070683.

Abstract

Although most clustering approaches rely on within-cluster information, other important information, such as between-cluster information, is rarely considered. In this study, we propose a new measure of between-cluster distance in subspace, based on maximizing the distance between the center of a cluster and the points that do not belong to that cluster. Building on this idea, we first design an objective function that integrates the between-cluster distance with entropy regularization. Updating rules are then derived by theoretical analysis. The properties of the proposed algorithm are investigated, and its performance is evaluated experimentally on two synthetic and seven real-life datasets. The experimental results demonstrate that the proposed algorithm (ERKM) outperforms most existing state-of-the-art k-means-type clustering algorithms in most cases.

Keywords: between-cluster information; data mining; entropy regularization; k-means.
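
To make the two ingredients named in the abstract concrete, the following minimal Python sketch computes (a) a per-feature between-cluster dispersion, taken as the squared distance between one cluster's center and the points outside that cluster, and (b) entropy-regularized feature weights. It is an illustration under stated assumptions, not the ERKM objective or its updating rules: the helper names, the trade-off parameter `eta`, the regularization strength `gamma`, the toy data, and in particular the way the within- and between-cluster terms are combined (`within - eta * between`) are all hypothetical. The only standard result used is that minimizing sum_j w_j s_j + gamma * sum_j w_j log w_j subject to sum_j w_j = 1 gives weights proportional to exp(-s_j / gamma).

```python
import math


def cluster_center(points, labels, cluster):
    """Mean of the points assigned to `cluster`, per feature."""
    members = [p for p, lab in zip(points, labels) if lab == cluster]
    dims = len(members[0])
    return [sum(p[j] for p in members) / len(members) for j in range(dims)]


def per_feature_dispersion(points, center, mask):
    """Per-feature sum of squared differences to `center` over masked points."""
    totals = [0.0] * len(center)
    for point, keep in zip(points, mask):
        if keep:
            for j, c in enumerate(center):
                totals[j] += (point[j] - c) ** 2
    return totals


def entropy_weights(scores, gamma):
    """Closed-form minimizer of sum_j w_j*s_j + gamma*sum_j w_j*log(w_j)
    subject to sum_j w_j = 1, i.e. a softmax of -scores/gamma."""
    lowest = min(scores)  # shift so the largest exponent is 0 (numerical stability)
    exps = [math.exp(-(s - lowest) / gamma) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data (hypothetical): feature 0 separates the clusters, feature 1 is noise.
points = [(0.0, 0.0), (0.2, 5.0), (10.0, 0.1), (10.2, 5.1)]
labels = [0, 0, 1, 1]
eta, gamma = 0.1, 5.0  # assumed trade-off and regularization values

center = cluster_center(points, labels, 0)
inside = [lab == 0 for lab in labels]
outside = [not flag for flag in inside]

within = per_feature_dispersion(points, center, inside)
between = per_feature_dispersion(points, center, outside)

# Assumed combination: reward features that are compact inside the cluster
# and far from the points outside it. This is not the paper's objective.
scores = [w - eta * b for w, b in zip(within, between)]
weights = entropy_weights(scores, gamma)
print(weights)
```

In this toy setting the discriminative feature receives almost all of the weight for cluster 0, which is the qualitative behaviour a subspace clustering method aims for; larger values of `gamma` push the weights toward uniform.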