Nonparametric Estimation of Probabilistic Membership for Subspace Clustering

IEEE Trans Cybern. 2020 Mar;50(3):1023-1036. doi: 10.1109/TCYB.2018.2878069. Epub 2018 Nov 8.

Abstract

Recent advances in subspace clustering have provided a new way of constructing affinity matrices for clustering. Unlike kernel-based subspace clustering, which requires tedious tuning among infinitely many kernel candidates, modern subspace clustering methods use self-expressive models derived from linear subspace assumptions and combine them rigorously with sparse or low-rank optimization theory, so that the affinity matrix is obtained as the solution of an optimization problem. Despite this appealing theoretical basis, the affinity matrices of modern subspace clustering have quite different meanings from traditional ones, and although they are expected to have a rough block-diagonal structure, it is unclear whether they are good enough for spectral clustering. In fact, most subspace clustering methods rearrange the affinity values in some way before applying spectral clustering, but the adequacy of this rearrangement is not clear, even though the spectral clustering step can have a critical impact on overall performance. To resolve this issue, we propose a nonparametric method that estimates the probabilistic cluster membership from these affinity matrices, from which the clusters can be found directly. The likelihood of an affinity matrix given the probabilistic membership is defined nonparametrically based on histograms. The membership itself is defined as a combination of probability simplices, and an additional prior probability, in the form of a Bernoulli distribution, is defined to regularize the membership. Solving the resulting maximum a posteriori problem replaces the spectral clustering procedure, and the final discrete cluster membership is obtained by selecting, for each sample, the cluster with the maximum probability. Applied to well-known subspace clustering methods, the proposed method achieves state-of-the-art performance on popular benchmark databases.
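
The two ends of the pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the maximum a posteriori estimation itself is not reproduced here. The function names `symmetrize_affinity` and `hard_labels` are hypothetical, and the rearrangement W = (|C| + |C|^T) / 2 is assumed only as one commonly used choice in self-expressive methods, not necessarily the one examined in the paper. The second function shows the final step stated in the abstract: each row of a membership matrix lies on a probability simplex, and the discrete label is the cluster with the maximum probability.

```python
import numpy as np


def symmetrize_affinity(C):
    """Turn a self-expressive coefficient matrix C into a symmetric,
    nonnegative affinity matrix.

    Assumption: W = (|C| + |C|^T) / 2 is used purely as an example of an
    "affinity value rearrangement"; the paper may consider others.
    """
    A = np.abs(np.asarray(C, dtype=float))
    return (A + A.T) / 2.0


def hard_labels(P, tol=1e-8):
    """Convert a probabilistic membership matrix P (n_samples x n_clusters)
    into discrete labels by picking the most probable cluster per sample.

    Each row of P is expected to lie on the probability simplex
    (nonnegative entries summing to 1).
    """
    P = np.asarray(P, dtype=float)
    if (P < -tol).any() or not np.allclose(P.sum(axis=1), 1.0, atol=1e-6):
        raise ValueError("each row of P must lie on the probability simplex")
    # np.argmax returns the first index on ties.
    return P.argmax(axis=1)


# Illustrative values only (not data from the paper).
C_example = np.array([[0.0, -2.0],
                      [1.0, 0.0]])
W_example = symmetrize_affinity(C_example)

P_example = np.array([[0.7, 0.3],
                      [0.2, 0.8],
                      [0.5, 0.5]])
labels_example = hard_labels(P_example)
```

In this sketch the membership matrix is supplied by hand. In the proposed method it would instead come from solving the maximum a posteriori problem on the affinity matrix, which takes the place of spectral clustering.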