Generalized Self-Organizing Maps for Automatic Determination of the Number of Clusters and Their Multiprototypes in Cluster Analysis

IEEE Trans Neural Netw Learn Syst. 2018 Jul;29(7):2833-2845. doi: 10.1109/TNNLS.2017.2704779. Epub 2017 Jun 7.

Abstract

This paper presents a generalization of self-organizing maps with 1-D neighborhoods (neuron chains) that can be effectively applied to complex cluster analysis problems. The essence of the generalization consists in introducing mechanisms that allow the neuron chain-during learning-to disconnect into subchains, to reconnect some of the subchains again, and to dynamically regulate the overall number of neurons in the system. These features enable the network-working in a fully unsupervised way (i.e., using unlabeled data without a predefined number of clusters)-to automatically generate collections of multiprototypes that are able to represent a broad range of clusters in data sets. First, the operation of the proposed approach is illustrated on some synthetic data sets. Then, this technique is tested using several real-life, complex, and multidimensional benchmark data sets available from the University of California at Irvine (UCI) Machine Learning repository and the Knowledge Extraction based on Evolutionary Learning data set repository. A sensitivity analysis of our approach to changes in control parameters and a comparative analysis with an alternative approach are also performed.