On the impact of dissimilarity measure in k-modes clustering algorithm

IEEE Trans Pattern Anal Mach Intell. 2007 Mar;29(3):503-7. doi: 10.1109/TPAMI.2007.53.

Abstract

This correspondence describes extensions to the k-modes algorithm for clustering categorical data. A heuristic approach developed in [4], [12] modifies the simple matching dissimilarity measure for categorical objects, which allows the k-modes paradigm to obtain clusters with strong intra-similarity and to cluster large categorical data sets efficiently. The main aim of this paper is to rigorously derive the updating formula of the k-modes clustering algorithm under the new dissimilarity measure and to establish the convergence of the algorithm within the optimization framework.
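
To make the two kinds of dissimilarity concrete, the sketch below contrasts the simple matching measure with a frequency-weighted variant. The abstract does not state the formula of the new measure, so `frequency_weighted` is an assumed form for illustration only: a mismatch costs 1, and a match costs 1 minus the share of cluster members that hold the same category. All function names are hypothetical, and `mode_of` shows the standard k-modes centre update rather than the formula derived in the paper.

```python
from collections import Counter


def simple_matching(x, z):
    """Count the attributes on which object x and mode z disagree."""
    return sum(1 for xj, zj in zip(x, z) if xj != zj)


def frequency_weighted(x, z, cluster):
    """Assumed frequency-based variant (not taken from the abstract).

    A mismatch on attribute j costs 1. A match costs 1 minus the fraction
    of objects in `cluster` whose attribute j equals x[j], so matching a
    category that dominates the cluster is cheaper than matching a rare one.
    """
    if not cluster:
        # With no members there are no frequencies; fall back to simple matching.
        return float(simple_matching(x, z))
    total = 0.0
    for j, (xj, zj) in enumerate(zip(x, z)):
        if xj != zj:
            total += 1.0
        else:
            share = sum(1 for obj in cluster if obj[j] == xj) / len(cluster)
            total += 1.0 - share
    return total


def mode_of(cluster):
    """Return the per-attribute most frequent category of a non-empty cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


# Toy cluster of four objects with two categorical attributes.
cluster = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")]
mode = mode_of(cluster)
d_simple = simple_matching(("a", "y"), mode)
d_weighted = frequency_weighted(("a", "y"), mode, cluster)
```

Under simple matching, an object that equals the mode always has distance 0 no matter how mixed the cluster is. In the assumed weighted form the distance shrinks only as the cluster becomes more homogeneous, which is one way to favour clusters with strong intra-similarity.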

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artifacts*
  • Artificial Intelligence*
  • Cluster Analysis*
  • Information Storage and Retrieval / methods*
  • Numerical Analysis, Computer-Assisted
  • Pattern Recognition, Automated / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity