Fuzzy cluster stability analysis with missing values using resampling

Int J Bioinform Res Appl. 2009;5(2):207-23. doi: 10.1504/IJBRA.2009.024038.

Abstract

Exploratory data analysis, such as grouping the data into clusters, is often necessary to evaluate potential hypotheses for subsequent studies. In real data sets, incompleteness is very common. We propose a fuzzy clustering method that tolerates missing values by using resampling (bootstrapping) and cluster stability analysis. The quality of classification is assessed with measures such as F1 and Hubert. The central idea is to compare a reference cluster with many clusters obtained from sub-samples of the original data set. The results demonstrate that our method is capable of identifying relevant partitions even when a large proportion of values is missing.
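
The comparison of a reference clustering against clusterings of sub-samples can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify their algorithmic details. The sketch rests on several assumptions: missing values are encoded as NaN and handled with a partial-distance variant of fuzzy c-means, sub-samples are drawn without replacement, fuzzy memberships are hardened by taking the largest membership, and agreement is scored with a pairwise F1 only. All function names and parameters (`fuzzy_cmeans_missing`, `pairwise_f1`, `stability`, `frac`, `n_resamples`) are hypothetical.

```python
import numpy as np

def fuzzy_cmeans_missing(X, c, m=2.0, n_iter=100, seed=0):
    """Fuzzy c-means that uses only observed (non-NaN) features.
    Assumption: partial-distance strategy, not necessarily the paper's."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)                      # True where a value is observed
    Xz = np.where(mask, X, 0.0)              # NaN replaced by 0, masked out below
    n, d = X.shape
    U = rng.dirichlet(np.ones(c), size=n)    # random initial memberships (n x c)
    obs = np.maximum(mask.sum(axis=1), 1)    # observed features per point
    for _ in range(n_iter):
        W = U ** m
        # Centers: membership-weighted mean over observed entries only.
        V = (W.T @ Xz) / np.maximum(W.T @ mask.astype(float), 1e-12)
        # Partial squared distances, rescaled by d / (observed count).
        diff = (Xz[:, None, :] - V[None, :, :]) * mask[:, None, :]
        D2 = np.maximum((diff ** 2).sum(axis=2) * (d / obs)[:, None], 1e-12)
        inv = D2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

def pairwise_f1(a, b):
    """F1 over point pairs: a pair is positive if both points share a cluster.
    Invariant to how the cluster labels are numbered."""
    iu = np.triu_indices(len(a), k=1)
    sa = (a[:, None] == a[None, :])[iu]
    sb = (b[:, None] == b[None, :])[iu]
    tp = np.sum(sa & sb)
    fp = np.sum(~sa & sb)
    fn = np.sum(sa & ~sb)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 1.0

def stability(X, c, n_resamples=20, frac=0.8, seed=0):
    """Mean pairwise F1 between a reference clustering and sub-sample clusterings."""
    rng = np.random.default_rng(seed)
    ref = fuzzy_cmeans_missing(X, c, seed=seed)[0].argmax(axis=1)
    scores = []
    for r in range(n_resamples):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        sub = fuzzy_cmeans_missing(X[idx], c, seed=seed + r + 1)[0].argmax(axis=1)
        scores.append(pairwise_f1(ref[idx], sub))
    return float(np.mean(scores))

# Synthetic demo: two well-separated groups with about 20% of entries missing.
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([demo_rng.normal(0.0, 0.3, (40, 4)),
                    demo_rng.normal(5.0, 0.3, (40, 4))])
X_demo[demo_rng.random(X_demo.shape) < 0.2] = np.nan
score = stability(X_demo, c=2)
print(f"mean pairwise F1 across sub-samples: {score:.3f}")
```

In a sketch like this, a mean score close to 1 would suggest that the chosen number of clusters yields a partition that is reproducible across sub-samples, while lower scores would suggest an unstable partition. Repeating the call for several values of `c` is one plausible way to compare candidate partitions.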

MeSH terms

  • Cluster Analysis
  • Computational Biology / methods*
  • Fuzzy Logic*