Centerless Clustering

IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):167-181. doi: 10.1109/TPAMI.2022.3150981. Epub 2022 Dec 5.

Abstract

Although many clustering models have been proposed recently, k-means and the family of spectral clustering methods still draw a lot of attention due to their simplicity and efficacy. We first reviewed the unified framework of k-means and graph cut models, and then proposed a clustering method called k-sums, in which a k-nearest neighbor (k-NN) graph is adopted. The main idea of k-sums is to directly minimize the sum of the distances between points in the same cluster. To deal with the situation where the graph is unavailable, we proposed k-sums-x, which takes features as input. The computational and memory overhead of k-sums are both O(nk), indicating that it scales linearly w.r.t. the number of objects to group. Moreover, the computational and memory costs are independent of the product of the number of points and the number of clusters. The computational and memory complexity of k-sums-x are both linear w.r.t. the number of points. To validate the advantage of k-sums and k-sums-x on facial datasets, extensive experiments have been conducted on 10 synthetic datasets and 17 benchmark datasets. While having a low time complexity, k-sums performs comparably with several state-of-the-art clustering methods.
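
To make the stated objective concrete, the following is a minimal sketch of the quantity the abstract describes: the sum of distances between points that share a cluster. It is not the authors' k-sums algorithm. It only evaluates the objective for a given labeling, it uses all pairwise Euclidean distances rather than a k-NN graph (so it costs O(n^2) instead of O(nk)), and the function name `within_cluster_distance_sum` and the toy data are hypothetical, chosen for illustration.

```python
from itertools import combinations
from math import dist


def within_cluster_distance_sum(points, labels):
    """Sum of Euclidean distances over all unordered pairs with the same label.

    Illustrative only: a dense O(n^2) evaluation of the objective,
    not the k-NN-graph-based optimization described in the abstract.
    """
    total = 0.0
    for i, j in combinations(range(len(points)), 2):
        if labels[i] == labels[j]:
            total += dist(points[i], points[j])
    return total


# Toy data (assumed): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # groups distant points together

good_cost = within_cluster_distance_sum(points, good_labels)
bad_cost = within_cluster_distance_sum(points, bad_labels)
```

On this toy data the labeling that groups nearby points yields the smaller sum, which is the behavior a centerless objective of this kind rewards without computing any cluster centers.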