Coordinate Descent Method for k-means

IEEE Trans Pattern Anal Mach Intell. 2022 May;44(5):2371-2385. doi: 10.1109/TPAMI.2021.3085739. Epub 2022 Apr 1.

Abstract

k-means method using Lloyd heuristic is a traditional clustering method which has played a key role in multiple downstream tasks of machine learning because of its simplicity. However, Lloyd heuristic always finds a bad local minimum, i.e., the bad local minimum makes objective function value not small enough, which limits the performance of k-means. In this paper, we use coordinate descent (CD) method to solve the problem. First, we show that the k-means minimization problem can be reformulated as a trace maximization problem, then a simple and efficient coordinate descent scheme is proposed to solve the maximization problem. Two interesting findings through theory are that Lloyd cannot decrease the objective function value of k-means produced by our CD further, and our proposed method CD to solve k-means problem can avoid produce empty clusters. In addition, according to the computational complexity analysis, it is verified CD has the same time complexity with original k-means method. Extensive experiments including statistical hypothesis testing, on several real-world datasets with varying number of clusters, varying number of samples and varying number of dimensions show that CD performs better compared to Lloyd, i.e., lower objective value, better local minimum and fewer iterations. And CD is more robust to initialization than Lloyd whether the initialization strategy is random or initialization of k-means++.