Nonsmooth Optimization-Based Model and Algorithm for Semisupervised Clustering

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5517-5530. doi: 10.1109/TNNLS.2021.3129370. Epub 2023 Sep 1.

Abstract

Using a nonconvex nonsmooth optimization approach, we introduce a model for semisupervised clustering (SSC) with pairwise constraints. In this model, the objective function is represented as a sum of three terms: the first term reflects the clustering error for unlabeled data points, the second term expresses the error for data points with must-link (ML) constraints, and the third term represents the error for data points with cannot-link (CL) constraints. This function is nonconvex and nonsmooth. To find its optimal solutions, we introduce an adaptive SSC (A-SSC) algorithm. This algorithm is based on the combination of the nonsmooth optimization method and an incremental approach, which involves the auxiliary SSC problem. The algorithm constructs clusters incrementally starting from one cluster and gradually adding one cluster center at each iteration. The solutions to the auxiliary SSC problem are utilized as starting points for solving the nonconvex SSC problem. The discrete gradient method (DGM) of nonsmooth optimization is applied to solve the underlying nonsmooth optimization problems. This method does not require subgradient evaluations and uses only function values. The performance of the A-SSC algorithm is evaluated and compared with four benchmarking SSC algorithms on one synthetic and 12 real-world datasets. Results demonstrate that the proposed algorithm outperforms the other four algorithms in identifying compact and well-separated clusters while satisfying most constraints.