Divergence-Based Locally Weighted Ensemble Clustering with Dictionary Learning and L2,1-Norm

Jiaxuan Xu; Jiang Wu; Taiyong Li; Yang Nan

doi:10.3390/e24101324

Entropy (Basel). 2022 Sep 21;24(10):1324. doi: 10.3390/e24101324.

Authors

Jiaxuan Xu¹, Jiang Wu¹, Taiyong Li¹, Yang Nan²

Affiliations

¹ School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu 611130, China.
² Department of Computer Science, Harbin Finance University, Harbin 150030, China.

Abstract

Accurate clustering of unlabeled data is a challenging task. Ensemble clustering combines a set of base clusterings to obtain a better and more stable clustering, and it has been shown to improve clustering accuracy. Dense representation ensemble clustering (DREC) and entropy-based locally weighted ensemble clustering (ELWEC) are two typical ensemble clustering methods. However, DREC treats all microclusters equally and hence ignores the differences between them, while ELWEC conducts clustering on clusters rather than microclusters and ignores the sample-cluster relationship. To address these issues, this paper proposes divergence-based locally weighted ensemble clustering with dictionary learning (DLWECDL). DLWECDL consists of four phases. First, the clusters from the base clusterings are used to generate microclusters. Second, a Kullback-Leibler divergence-based ensemble-driven cluster index is used to measure the weight of each microcluster. Third, with these weights, an ensemble clustering algorithm with dictionary learning and the L2,1-norm is employed; its objective function is solved by optimizing four subproblems, and a similarity matrix is learned. Finally, a normalized cut (Ncut) is used to partition the similarity matrix and obtain the ensemble clustering results. The proposed DLWECDL was validated on 20 widely used datasets and compared with other state-of-the-art ensemble clustering methods. The experimental results demonstrated that DLWECDL is a very promising method for ensemble clustering.
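
The abstract names two standard quantities without defining them: the Kullback-Leibler divergence and the L2,1-norm. The sketch below illustrates only their textbook definitions in plain Python. It is not the paper's ensemble-driven cluster index or its objective function, neither of which is specified in the abstract. The function names `kl_divergence` and `l21_norm` are hypothetical, and the row-wise convention for the L2,1-norm is an assumption: some papers sum over columns instead, which is equivalent to applying the same function to the transposed matrix.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) = sum_i p_i * ln(p_i / q_i).

    p and q are sequences of probabilities of equal length.
    Terms with p_i == 0 contribute 0; if p_i > 0 while q_i == 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def l21_norm(matrix):
    """L2,1-norm under the row-wise convention (an assumption here):
    the sum of the Euclidean norms of the rows of `matrix`,
    where `matrix` is a list of equal-length rows.
    """
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


if __name__ == "__main__":
    # Identical distributions have zero divergence.
    print(kl_divergence([0.5, 0.5], [0.5, 0.5]))
    # Rows (3, 4), (0, 0), (5, 12) have Euclidean norms 5, 0, and 13.
    print(l21_norm([[3, 4], [0, 0], [5, 12]]))
```

Because the L2,1-norm sums unsquared row norms, it tends to drive entire rows toward zero when used as a penalty, which is the usual motivation for choosing it over the Frobenius norm when noise is assumed to be concentrated in a few rows (or samples).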

Keywords: L2,1-norm; clustering; dictionary learning; ensemble clustering; similarity; subspace clustering.
