Multiple Kernel k-Means Clustering by Selecting Representative Kernels

IEEE Trans Neural Netw Learn Syst. 2021 Nov;32(11):4983-4996. doi: 10.1109/TNNLS.2020.3026532. Epub 2021 Oct 27.

Abstract

To cluster data that are not linearly separable in the original feature space, k-means clustering has been extended to a kernel version. However, the performance of kernel k-means clustering depends heavily on the choice of kernel function. To mitigate this problem, multiple kernel learning has been introduced into k-means clustering to obtain an optimal kernel combination for clustering. Despite the success of multiple kernel k-means clustering in various scenarios, few existing methods update the combination coefficients based on the diversity of the kernels, so the selected kernels are often highly redundant, which degrades both clustering performance and efficiency. In this article, we address this problem from the perspective of subset selection. Specifically, we first propose an effective strategy for selecting a diverse subset of the prespecified kernels as representative kernels, and then incorporate this subset selection process into the multiple kernel k-means clustering framework. The representative kernels are indicated by significant combination weights. Because the resulting objective function is nonconvex, we develop an alternating minimization method that optimizes the combination coefficients of the selected kernels and the cluster membership in turn. In particular, an efficient optimization method is developed to reduce the time complexity of updating the kernel combination weights. Finally, extensive experiments on benchmark and real-world data sets demonstrate the effectiveness and superiority of our approach over existing methods.
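The core machinery the abstract builds on, combining base kernel matrices with nonnegative weights and running kernel k-means on the combined kernel, can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's method: the representative-kernel subset selection, the alternating weight update, and the efficient weight-optimization step are all omitted, and the farthest-point initialization is an arbitrary choice for reproducibility.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Gaussian (RBF) kernel matrix: K_ij = exp(-gamma * ||x_i - x_j||^2).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def combine_kernels(kernels, weights):
    # Weighted combination K = sum_p w_p K_p with weights normalized
    # to the simplex (a common convention in multiple kernel learning).
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels))

def kernel_kmeans(K, n_clusters, n_iter=100):
    # Lloyd-style kernel k-means on a precomputed kernel matrix.
    # Squared feature-space distance to a cluster mean mu_c:
    #   K_xx - (2/|c|) * sum_{j in c} K_xj + (1/|c|^2) * sum_{j,l in c} K_jl
    n = K.shape[0]
    diag = np.diag(K)
    # Deterministic farthest-point initialization of one seed per cluster.
    centers = [0]
    for _ in range(1, n_clusters):
        d = np.min([diag - 2.0 * K[:, c] + diag[c] for c in centers], axis=0)
        centers.append(int(np.argmax(d)))
    labels = np.argmin([diag - 2.0 * K[:, c] + diag[c] for c in centers], axis=0)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            nc = mask.sum()
            if nc == 0:
                continue  # leave empty clusters at infinite distance
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / nc
                          + K[np.ix_(mask, mask)].sum() / nc ** 2)
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

# Toy usage: two well-separated blobs, two base kernels, uniform weights.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(10, 2)),
               rng.normal(5.0, 0.3, size=(10, 2))])
K = combine_kernels([rbf_kernel(X, 0.5), rbf_kernel(X, 2.0)], [1.0, 1.0])
labels = kernel_kmeans(K, n_clusters=2)
```

In the paper's setting the weights would additionally be sparsified so that only the representative kernels carry significant mass; here they are simply fixed and normalized.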