Transforming Complex Problems Into K-Means Solutions

Hongfu Liu; Junxiang Chen; Jennifer Dy; Yun Fu

doi:10.1109/TPAMI.2023.3237667

Transforming Complex Problems Into K-Means Solutions

IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):9149-9168. doi: 10.1109/TPAMI.2023.3237667. Epub 2023 Jun 5.

Authors

Hongfu Liu, Junxiang Chen, Jennifer Dy, Yun Fu

PMID: 37021920
PMCID: PMC10332815 (available on 2024-07-01)
DOI: 10.1109/TPAMI.2023.3237667

Abstract

K-means is a fundamental clustering algorithm widely used in both academic and industrial applications. Its popularity can be attributed to its simplicity and efficiency. Studies show the equivalence of K-means to principal component analysis, non-negative matrix factorization, and spectral clustering. However, these studies focus on standard K-means with squared euclidean distance. In this review paper, we unify the available approaches in generalizing K-means to solve challenging and complex problems. We show that these generalizations can be seen from four aspects: data representation, distance measure, label assignment, and centroid updating. As concrete applications of transforming problems into modified K-means formulation, we review the following applications: iterative subspace projection and clustering, consensus clustering, constrained clustering, domain adaptation, and outlier detection.

Grants and funding

U01 HL089856/HL/NHLBI NIH HHS/United States