Deep Self-Evolution Clustering

IEEE Trans Pattern Anal Mach Intell. 2020 Apr;42(4):809-823. doi: 10.1109/TPAMI.2018.2889949. Epub 2018 Dec 27.

Abstract

Clustering is a crucial but challenging task in pattern analysis and machine learning. Existing methods often ignore the combination between representation learning and clustering. To tackle this problem, we reconsider the clustering task from its definition to develop Deep Self-Evolution Clustering (DSEC) to jointly learn representations and cluster data. For this purpose, the clustering task is recast as a binary pairwise-classification problem to estimate whether pairwise patterns are similar. Specifically, similarities between pairwise patterns are defined by the dot product between indicator features which are generated by a deep neural network (DNN). To learn informative representations for clustering, clustering constraints are imposed on the indicator features to represent specific concepts with specific representations. Since the ground-truth similarities are unavailable in clustering, an alternating iterative algorithm called Self-Evolution Clustering Training (SECT) is presented to select similar and dissimilar pairwise patterns and to train the DNN alternately. Consequently, the indicator features tend to be one-hot vectors and the patterns can be clustered by locating the largest response of the learned indicator features. Extensive experiments strongly evidence that DSEC outperforms current models on twelve popular image, text and audio datasets consistently.