Discovering Multiple Co-Clusterings With Matrix Factorization

IEEE Trans Cybern. 2021 Jul;51(7):3576-3587. doi: 10.1109/TCYB.2019.2950568. Epub 2021 Jun 23.

Abstract

Clustering is a fundamental data exploration task that aims to discover the hidden grouping structure in data. Traditional clustering methods typically compute a single partition. However, complex data often admit several different and equally meaningful clusterings. To address this, multiple clustering approaches have emerged with the goal of exploring alternative clusterings from different perspectives. Existing solutions mainly focus on one-way clustering, that is, they cluster either the samples or the features. For many practical tasks, however, it is meaningful and desirable to explore alternative two-way clusterings (or co-clusterings), which capture both the sample cluster structure and the feature cluster structure. To tackle this interesting and unresolved task, we introduce an approach, called multiple co-clusterings (MultiCC), that generates multiple alternative co-clusterings at the same time. MultiCC uses matrix tri-factorization to seek the co-clustering indicator matrices for samples and features, and defines row and column redundancy quantification terms, based on these indicator matrices, to enforce diversity among the co-clusterings. It then integrates the matrix tri-factorization and the two nonredundancy terms into a unified objective function and gives an alternating optimization procedure to optimize that objective. Extensive experimental results demonstrate that MultiCC performs significantly better than existing multiple clustering methods. In addition, MultiCC can discover interesting co-clusters that the compared methods cannot find.
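
As a rough illustration of the matrix tri-factorization step only, the sketch below factorizes a nonnegative data matrix X into a row (sample) indicator matrix R, a small block matrix S, and a column (feature) indicator matrix C, so that X is approximately R S C^T. It is a minimal sketch of generic nonnegative tri-factorization with standard multiplicative updates, not the MultiCC objective: the row and column redundancy terms are left out because the abstract does not give their form. The function name `tri_factorize`, its parameters, and the toy data are all assumptions made for this example.

```python
import numpy as np


def tri_factorize(X, k_rows, k_cols, n_iter=200, seed=0, eps=1e-9):
    """Generic nonnegative tri-factorization X ~= R @ S @ C.T.

    Hypothetical helper for illustration; it omits the redundancy
    terms that MultiCC adds to its objective.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random positive initialization keeps all factors nonnegative.
    R = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    C = rng.random((d, k_cols)) + eps
    errors = []
    for _ in range(n_iter):
        # Alternate: update one factor while the other two are fixed.
        R *= (X @ C @ S.T) / (R @ (S @ C.T @ C @ S.T) + eps)
        C *= (X.T @ R @ S) / (C @ (S.T @ R.T @ R @ S) + eps)
        S *= (R.T @ X @ C) / (R.T @ R @ S @ C.T @ C + eps)
        errors.append(np.linalg.norm(X - R @ S @ C.T) ** 2)
    return R, S, C, errors


# Toy data (an assumption): 6 samples x 4 features with two
# row blocks and two column blocks, plus small positive noise.
noise = np.random.default_rng(1).random((6, 4)) * 0.05
X = np.kron(np.array([[1.0, 0.0], [0.0, 1.0]]), np.ones((3, 2))) + noise

R, S, C, errors = tri_factorize(X, k_rows=2, k_cols=2)

# Read hard cluster assignments off the soft indicator matrices.
row_labels = R.argmax(axis=1)
col_labels = C.argmax(axis=1)
print("sample clusters:", row_labels)
print("feature clusters:", col_labels)
```

Taking the arg-max of each row of R and C is one simple way to turn the soft indicators into a co-clustering. Producing several diverse co-clusterings, as the abstract describes, would additionally require penalizing overlap between the indicator matrices of different solutions.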