An interactive approach to multiobjective clustering of gene expression patterns

IEEE Trans Biomed Eng. 2013 Jan;60(1):35-41. doi: 10.1109/TBME.2012.2220765. Epub 2012 Sep 28.

Abstract

Some recent studies have posed data clustering as a multiobjective optimization problem, in which several cluster validity indices are optimized simultaneously to obtain tradeoff clustering solutions. A number of cluster validity indices are available in the literature. However, no single index performs equally well on all kinds of datasets: depending on the properties of the dataset and its inherent clustering structure, different validity indices perform differently. It is therefore important to find the best set of validity indices to optimize simultaneously in order to obtain good clustering results. In this paper, a novel interactive genetic algorithm-based multiobjective approach is proposed that simultaneously finds the clustering solution and evolves the set of validity indices to be optimized. The proposed method interactively takes input from a human decision maker (DM) during execution and adaptively learns from that input to obtain the final set of validity indices along with the final clustering result. The algorithm is applied to clustering real-life benchmark gene expression datasets, and its performance is compared with that of several existing clustering algorithms to demonstrate its effectiveness. The results indicate that the proposed method outperforms the other algorithms on all the datasets considered here.
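To make the idea of tradeoff clustering solutions concrete, the following minimal sketch scores a few candidate partitions with two validity indices and keeps the non-dominated ones. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the interactive genetic algorithm and the DM feedback loop are not reproduced, the two indices (`compactness` and `separation`) are simple stand-ins rather than the indices used by the authors, and all function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def centroids(points, labels):
    """Return {label: centroid} for a partition given as one label per point."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return {lab: tuple(sum(c) / len(m) for c in zip(*m))
            for lab, m in clusters.items()}


def compactness(points, labels):
    """Mean distance of each point to its cluster centroid (lower is better)."""
    cents = centroids(points, labels)
    return sum(dist(p, cents[lab]) for p, lab in zip(points, labels)) / len(points)


def separation(points, labels):
    """Smallest distance between two cluster centroids (higher is better)."""
    cents = list(centroids(points, labels).values())
    if len(cents) < 2:
        return 0.0  # a single cluster has no separation
    return min(dist(a, b) for a, b in combinations(cents, 2))


def dominates(a, b):
    """True if score vector a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scores):
    """Indices of the score vectors that no other vector dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]


# Toy data (assumed for illustration): two well-separated pairs of points.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
candidates = [
    [0, 0, 1, 1],  # groups the natural pairs
    [0, 1, 0, 1],  # mixes the pairs
    [0, 0, 0, 1],  # one large cluster plus a singleton
]

# Negate separation so that both objectives are minimized.
scores = [(compactness(points, c), -separation(points, c)) for c in candidates]
front = pareto_front(scores)
print("non-dominated candidates:", front)
```

On real data the indices usually conflict, so several candidates remain on the front; which set of indices defines that front is the choice the paper proposes to evolve with help from the decision maker.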

MeSH terms

  • Algorithms*
  • Cluster Analysis*
  • Computational Biology / methods*
  • Databases, Genetic
  • Fibroblasts / metabolism
  • Fuzzy Logic
  • Gene Expression Profiling / methods*
  • Humans
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis
  • Pattern Recognition, Automated / methods*
  • Yeasts / genetics