Learning Deep Generative Clustering via Mutual Information Maximization

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):6263-6275. doi: 10.1109/TNNLS.2021.3135375. Epub 2023 Sep 1.

Abstract

Deep clustering refers to joint representation learning and clustering using deep neural networks. Existing methods can be broadly categorized into two types: discriminative and generative. The former learns representations for clustering directly through discriminative mechanisms, while the latter estimates the latent distribution of each cluster to generate data points and then infers cluster assignments. Although generative methods have the advantage of estimating the latent distributions of clusters, their performance still falls significantly behind that of discriminative methods. In this work, we argue that this performance gap may be partly due to the overlap between the data distributions of different clusters. In fact, there is little guarantee that generative methods separate the distributions of different clusters in the data space. To tackle this problem, we theoretically prove that mutual information maximization promotes the separation of different clusters in the data space, which provides a theoretical justification for deep generative clustering with mutual information maximization. Our theoretical analysis directly leads to a model that integrates a hierarchical generative adversarial network with mutual information maximization. Moreover, we propose three additional techniques and empirically demonstrate their effectiveness in stabilizing and enhancing the model. The proposed approach notably outperforms other generative models for deep clustering on public benchmarks.
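The core claim above, that mutual information between cluster assignments and data is higher when clusters are separated in the data space, can be illustrated with a small empirical check. The sketch below is not the paper's method; it is a minimal NumPy experiment (all function names and parameters are our own) that estimates I(C; X) from a discretized 1-D mixture of two Gaussians and shows that the estimate grows as the cluster means move apart.

```python
import numpy as np

def empirical_mi(labels, bin_ids, n_labels, n_bins):
    """Plug-in estimate of I(C; X) from cluster labels and discretized data."""
    joint, _, _ = np.histogram2d(labels, bin_ids, bins=[n_labels, n_bins])
    pxy = joint / joint.sum()                     # joint distribution p(c, x)
    px = pxy.sum(axis=1, keepdims=True)           # marginal p(c)
    py = pxy.sum(axis=0, keepdims=True)           # marginal p(x)
    nz = pxy > 0                                  # avoid log(0) on empty cells
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
n = 50_000
labels = rng.integers(0, 2, n)                    # balanced cluster assignments

def mi_for_separation(delta):
    # Each cluster is a unit Gaussian; delta is the distance between
    # their means, i.e. how separated the clusters are in data space.
    x = rng.normal(loc=labels * delta, scale=1.0, size=n)
    edges = np.linspace(x.min(), x.max(), 30)
    bin_ids = np.digitize(x, edges)               # discretize x for the estimate
    return empirical_mi(labels, bin_ids, 2, bin_ids.max() + 1)

mi_overlap = mi_for_separation(0.5)    # heavily overlapping clusters
mi_separated = mi_for_separation(4.0)  # well-separated clusters
print(mi_overlap, mi_separated)
```

With well-separated clusters the labels are nearly determined by the data, so I(C; X) approaches its maximum H(C) = log 2; with heavy overlap it drops toward zero. This is the intuition behind using MI maximization as a separation-promoting objective.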