Partition Level Constrained Clustering

IEEE Trans Pattern Anal Mach Intell. 2018 Oct;40(10):2469-2483. doi: 10.1109/TPAMI.2017.2763945. Epub 2017 Oct 17.

Abstract

Constrained clustering uses prior knowledge to improve clustering performance. Here we introduce a new type of constraint, called partition level side information, and propose the Partition Level Constrained Clustering (PLCC) framework, in which labels are given for only a small proportion of the data to guide the clustering procedure. Our goal is to find a partition that captures the intrinsic structure of the data and also agrees with the partition level side information. We derive a K-means-based algorithm that incorporates partition level side information and give its corresponding solution. We then extend it to handle multiple sources of side information and design a counterpart algorithm for spectral clustering. Extensive experiments demonstrate the effectiveness and efficiency of our method compared with pairwise constrained clustering and ensemble clustering methods, even in the inconsistent cluster number setting, which verifies the superiority of partition level side information over pairwise constraints. In addition, our method is highly robust to noisy side information, and we validate its performance with multiple sources of side information. Finally, an image cosegmentation application based on saliency-guided side information demonstrates the effectiveness of PLCC as a flexible framework across different domains, even with unsupervised side information.
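
The idea of guiding K-means with labels on a small subset of points can be illustrated with a minimal sketch. This is not the authors' algorithm or its derived solution, which the abstract does not spell out. It is one plausible reading under stated assumptions: labeled points carry a one-hot side-information vector, the assignment cost adds a squared distance on that vector weighted by a hypothetical trade-off parameter `lam`, and unlabeled points (encoded here as `-1`) are compared on features only. The function name `plcc_kmeans_sketch`, the `-1` encoding, and the farthest-first initialization are all illustrative choices.

```python
import numpy as np

def plcc_kmeans_sketch(X, side_labels, k, lam=1.0, n_iter=50, seed=0):
    """Illustrative K-means variant with partial label side information.

    X           : (n, d) feature matrix.
    side_labels : (n,) integer array; -1 marks a point without a label
                  (an assumed encoding, not taken from the paper).
    lam         : assumed weight on agreement with the side information.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    labeled = side_labels >= 0
    n_side = int(side_labels.max()) + 1 if labeled.any() else 0

    # One-hot side-information block; rows of unlabeled points stay zero.
    S = np.zeros((n, n_side))
    S[labeled, side_labels[labeled]] = 1.0

    # Farthest-first initialization (an assumption made for determinism).
    idx = [int(rng.integers(n))]
    for _ in range(1, k):
        d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(d2.argmax()))
    C_feat, C_side = X[idx].copy(), S[idx].copy()

    assign = np.full(n, -1)
    for _ in range(n_iter):
        # Feature distance for every point; side distance only where labeled.
        d_feat = ((X[:, None, :] - C_feat[None, :, :]) ** 2).sum(axis=2)
        d_side = ((S[:, None, :] - C_side[None, :, :]) ** 2).sum(axis=2)
        dist = d_feat + lam * labeled[:, None] * d_side
        new_assign = dist.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign

        # Update feature centroids from all members and side centroids
        # from labeled members only; empty clusters keep their old centroid.
        for j in range(k):
            members = assign == j
            if members.any():
                C_feat[j] = X[members].mean(axis=0)
            lab_members = members & labeled
            if lab_members.any():
                C_side[j] = S[lab_members].mean(axis=0)
    return assign
```

With `lam=0` the sketch reduces to plain K-means on the features. Larger values pull labeled points toward clusters whose labeled members share their side-information label.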

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.