On the impact of dissimilarity measure in k-modes clustering algorithm

Michael K Ng; Mark Junjie Li; Joshua Zhexue Huang; Zengyou He

doi:10.1109/TPAMI.2007.53

IEEE Trans Pattern Anal Mach Intell. 2007 Mar;29(3):503-7. doi: 10.1109/TPAMI.2007.53.

Authors

Michael K Ng¹, Mark Junjie Li, Joshua Zhexue Huang, Zengyou He

Affiliation

¹ Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong. mng@math.hkbu.edu.hk

PMID: 17224620

Abstract

This correspondence describes extensions to the k-modes algorithm for clustering categorical data. By modifying a simple matching dissimilarity measure for categorical objects, a heuristic approach was developed in [4], [12] which allows the use of the k-modes paradigm to obtain a cluster with strong intrasimilarity and to efficiently cluster large categorical data sets. The main aim of this paper is to rigorously derive the updating formula of the k-modes clustering algorithm with the new dissimilarity measure and the convergence of the algorithm under the optimization framework.
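
The two ingredients named in the abstract, a dissimilarity measure for categorical objects and a mode update, can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: `simple_matching` and `cluster_mode` follow the usual k-modes definitions, while `frequency_weighted` is only one plausible reading of the "modified" measure (a matching category costs one minus its relative frequency in the cluster) and is not taken from the paper. All function names are hypothetical.

```python
from collections import Counter


def simple_matching(x, z):
    """Count the attributes on which object x and mode z disagree."""
    return sum(a != b for a, b in zip(x, z))


def cluster_mode(objects):
    """Return the attribute-wise most frequent category of one cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*objects))


def frequency_weighted(x, z, members):
    """Assumed form of the modified measure (not confirmed by the abstract).

    A mismatch on an attribute costs 1, as in simple matching. A match costs
    1 minus the share of cluster members that hold the mode's category, so
    agreeing with a weakly supported mode value is still penalised.
    """
    n = len(members)
    total = 0.0
    for j, (a, b) in enumerate(zip(x, z)):
        if a != b:
            total += 1.0
        else:
            share = sum(1 for m in members if m[j] == b) / n
            total += 1.0 - share
    return total


# Toy cluster with two categorical attributes (illustrative data only).
cluster = [("a", "x"), ("a", "y"), ("b", "x")]
mode = cluster_mode(cluster)
plain = simple_matching(("a", "y"), mode)
weighted = frequency_weighted(("a", "y"), mode, cluster)
```

Under this assumed form, an object that matches the mode on every attribute can still have a nonzero dissimilarity when the mode's categories are held by only part of the cluster, which is one way a measure could favour clusters with strong intrasimilarity.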

Publication types

Comparative Study
Evaluation Study
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Artifacts*
Artificial Intelligence*
Cluster Analysis*
Information Storage and Retrieval / methods*
Numerical Analysis, Computer-Assisted
Pattern Recognition, Automated / methods*
Reproducibility of Results
Sensitivity and Specificity