Implicit Annealing in Kernel Spaces: A Strongly Consistent Clustering Approach

IEEE Trans Pattern Anal Mach Intell. 2023 May;45(5):5862-5871. doi: 10.1109/TPAMI.2022.3217137. Epub 2023 Apr 3.

Abstract

Kernel k-means clustering is a powerful tool for unsupervised learning of non-linearly separable data. Its merits are thoroughly validated on a suite of simulated datasets and real data benchmarks that feature nonlinear and multi-view separation. Since the earliest attempts, researchers have noted that such algorithms often become trapped by local minima arising from the non-convexity of the underlying objective function. In this paper, we generalize recent results leveraging a general family of means to combat sub-optimal local solutions to the kernel and multi-kernel settings. Called Kernel Power k-Means, our algorithm uses majorization-minimization (MM) to better solve this non-convex problem. We show that the method implicitly performs annealing in kernel feature space while retaining efficient, closed-form updates. We rigorously characterize its convergence properties both from computational and statistical points of view. In particular, we characterize the large sample behavior of the proposed method by establishing strong consistency guarantees as well as finite-sample bounds on the excess risk of the estimates through modern tools in learning theory. The proposal's efficacy is demonstrated through an array of simulated and real data experiments.