A classification-based approach to semi-supervised clustering with pairwise constraints

Neural Netw. 2020 Jul:127:193-203. doi: 10.1016/j.neunet.2020.04.017. Epub 2020 Apr 25.

Abstract

In this paper, we introduce a neural network framework for semi-supervised clustering with pairwise (must-link or cannot-link) constraints. In contrast to existing approaches, we decompose semi-supervised clustering into two simpler classification tasks: the first stage uses a pair of Siamese neural networks to label the unlabeled pairs of points as must-link or cannot-link; the second stage uses the fully pairwise-labeled dataset produced by the first stage in a supervised neural-network-based clustering method. The proposed approach is motivated by the observation that binary classification (such as assigning pairwise relations) is usually easier than multi-class clustering with partial supervision. Moreover, being classification-based, our method solves only well-defined classification problems rather than less well-specified clustering tasks. Extensive experiments on various datasets demonstrate the high performance of the proposed method.
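The two-stage decomposition described above can be illustrated with a minimal sketch of the data flow. This is not the authors' implementation: both stages use simple stand-ins so the example stays self-contained. A learned distance threshold replaces the Siamese networks in stage 1, and union-find merging replaces the supervised neural-network clustering in stage 2. All function names, the toy points, and the constraints are hypothetical.

```python
from itertools import combinations
from math import dist


def fit_pair_threshold(points, must_link, cannot_link):
    """Stage 1 stand-in (assumption: replaces the Siamese networks).

    Learns a single distance threshold from the constrained pairs.
    """
    ml = [dist(points[i], points[j]) for i, j in must_link]
    cl = [dist(points[i], points[j]) for i, j in cannot_link]
    # Midpoint between the largest must-link and smallest cannot-link distance.
    return (max(ml) + min(cl)) / 2.0


def label_all_pairs(points, threshold):
    """Label every pair of points: True = must-link, False = cannot-link."""
    return {
        (i, j): dist(points[i], points[j]) < threshold
        for i, j in combinations(range(len(points)), 2)
    }


def clusters_from_pairs(n, pair_labels):
    """Stage 2 stand-in (assumption: replaces the supervised network).

    Merges must-linked points with union-find and returns cluster ids.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), linked in pair_labels.items():
        if linked:
            parent[find(i)] = find(j)

    # Renumber the roots as consecutive cluster ids in order of appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Hypothetical toy data: two well-separated groups and three constraints.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
must_link = [(0, 1), (3, 4)]
cannot_link = [(0, 3)]

threshold = fit_pair_threshold(points, must_link, cannot_link)
pair_labels = label_all_pairs(points, threshold)
labels = clusters_from_pairs(len(points), pair_labels)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The sketch only shows how a few pairwise constraints are first expanded into a fully pairwise-labeled dataset and then turned into cluster assignments; the learned models in the paper take the place of both stand-ins.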

Keywords: Deep learning; Neural networks; Pairwise constraints; Semi-supervised clustering; Siamese neural networks.

MeSH terms

  • Cluster Analysis
  • Databases, Factual / trends
  • Neural Networks, Computer*
  • Supervised Machine Learning* / trends