Pseudo-Label Guided Collective Matrix Factorization for Multiview Clustering

IEEE Trans Cybern. 2022 Sep;52(9):8681-8691. doi: 10.1109/TCYB.2021.3051182. Epub 2022 Aug 18.

Abstract

Multiview clustering has attracted increasing attention in recent years because real-world data often comprise multiple features or views. Although existing clustering methods have achieved promising performance, two challenges remain: 1) most existing methods do not scale to large datasets because of the high computational cost of eigendecomposition or graph construction and 2) most methods learn latent representations and cluster structures separately. Such a two-step learning scheme neglects the correlation between the two learning stages and may yield a suboptimal clustering result. To address these challenges, this article proposes a pseudo-label guided collective matrix factorization (PLCMF) method that jointly learns latent representations and cluster structures. PLCMF first clusters each view separately to obtain pseudo-labels that reflect the intraview similarities of that view. It then imposes a pseudo-label constraint on collective matrix factorization to learn unified latent representations that preserve intraview and interview similarities simultaneously. Finally, it incorporates latent representation learning and cluster structure learning into a joint framework that yields clustering results directly. In addition, the weight of each view is learned adaptively within the joint framework according to the data distribution. The joint learning problem is solved by an efficient iterative updating method with linear complexity. Extensive experiments on six benchmark datasets indicate that the proposed method outperforms state-of-the-art multiview clustering methods in both clustering accuracy and computational efficiency.
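
The pipeline outlined in the abstract (per-view pseudo-labels, a shared factorization constrained by them, adaptive view weights) can be illustrated with a small NumPy sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and not the authors' formulation: the function names `kmeans` and `plcmf_sketch`, the parameters `beta` and `lam`, the ridge-regularized alternating least-squares updates, and the inverse-residual weighting heuristic are all hypothetical. The sketch also finishes with a separate k-means pass on the shared representation, which is a simplification; the published method is described as obtaining cluster assignments jointly.

```python
import numpy as np

def kmeans(X, c, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns integer labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]  # copy of c rows
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(c):
            if np.any(labels == j):  # leave an empty cluster's center as is
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def plcmf_sketch(views, c, k=None, beta=1.0, lam=1e-3, n_iter=30, seed=0):
    """Illustrative only. views: list of (n x d_v) arrays; c: cluster count."""
    n = views[0].shape[0]
    k = k or c
    # Step 1 (assumed form): cluster each view and append its one-hot
    # pseudo-labels, scaled by sqrt(beta), to that view's features.
    A = []
    for X in views:
        G = np.eye(c)[kmeans(X, c, seed=seed)]        # n x c one-hot
        A.append(np.hstack([X, np.sqrt(beta) * G]))   # n x (d_v + c)
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((n, k))                   # shared representation
    w = np.full(len(A), 1.0 / len(A))                 # view weights
    I = np.eye(k)
    for _ in range(n_iter):
        # Per-view basis: ridge least squares for A_v ≈ V @ B_v.
        B = [np.linalg.solve(V.T @ V + lam * I, V.T @ Av) for Av in A]
        # Shared V: minimize sum_v w_v ||A_v - V B_v||^2 + lam ||V||^2.
        lhs = sum(wv * Bv @ Bv.T for wv, Bv in zip(w, B)) + lam * I
        rhs = sum(wv * Av @ Bv.T for wv, Av, Bv in zip(w, A, B))
        V = np.linalg.solve(lhs, rhs.T).T
        # Assumed weighting heuristic: views with smaller residual get
        # larger weight; weights are normalized to sum to 1.
        err = np.array([np.linalg.norm(Av - V @ Bv) for Av, Bv in zip(A, B)])
        w = 1.0 / (2.0 * np.maximum(err, 1e-12))
        w /= w.sum()
    # Simplification: cluster the shared representation in a final step.
    return kmeans(V, c, seed=seed), V, w
```

Each update in this sketch costs time proportional to the number of samples n for fixed feature and latent dimensions, which mirrors the linear-complexity claim only loosely. A call such as `labels, V, w = plcmf_sketch([X1, X2], c=3)` returns cluster labels, the shared n x k representation, and one weight per view.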

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Learning*