Learnable Subspace Clustering

IEEE Trans Neural Netw Learn Syst. 2022 Mar;33(3):1119-1133. doi: 10.1109/TNNLS.2020.3040379. Epub 2022 Feb 28.

Abstract

This article studies the large-scale subspace clustering (LS2C) problem with millions of data points. Many popular subspace clustering methods, although considered state of the art on small-scale data, cannot directly handle the LS2C problem. A simple reason is that these methods typically use all data points as a large dictionary to build huge coding models, which leads to high time and space complexity. In this article, we develop a learnable subspace clustering paradigm that solves the LS2C problem efficiently. The key idea is to learn a parametric function that partitions high-dimensional data points into their underlying low-dimensional subspaces, in place of the computationally demanding classical coding models. Moreover, we propose a unified robust predictive coding machine (RPCM) to learn this parametric function, which can be solved by an alternating minimization algorithm. In addition, we provide a bounded contraction analysis of the parametric function. To the best of our knowledge, this is the first subspace clustering work to efficiently cluster millions of data points. Experiments on million-scale data sets verify that our paradigm outperforms related state-of-the-art methods in both efficiency and effectiveness.
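To make the paradigm concrete, the following is a minimal sketch of the general idea only, not the paper's RPCM: a classical coding model would build an n-by-n coefficient matrix over all n points, whereas here a small subsample is clustered by an alternating minimization routine (K-subspaces, used purely as a stand-in), and the fitted subspace bases define a parametric assignment function that is then applied point-wise to the remaining data in linear time. The subsample size, the synthetic data, and the K-subspaces routine are all illustrative assumptions.

# Illustrative sketch of the large-scale paradigm (NOT the paper's RPCM):
# fit on a small subsample, then apply a cheap parametric function to all points.
import numpy as np

def fit_subspaces(X, k, dim, iters=30, seed=0):
    """Alternating minimization: assign points to subspaces, then refit the bases."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = rng.integers(0, k, size=n)
    for _ in range(iters):
        bases = []
        for j in range(k):
            Xj = X[labels == j]
            if Xj.shape[0] < dim:                      # reseed empty or tiny clusters
                Xj = X[rng.choice(n, size=dim, replace=False)]
            _, _, Vt = np.linalg.svd(Xj, full_matrices=False)
            bases.append(Vt[:dim])                     # dim x d orthonormal basis
        labels = assign(X, bases)
    return bases, labels

def assign(X, bases):
    """Parametric assignment function: nearest subspace by projection residual.
    Cost is O(n * d * dim * k); no n-by-n coding model is ever formed."""
    residuals = np.stack(
        [np.linalg.norm(X - (X @ B.T) @ B, axis=1) for B in bases], axis=1)
    return residuals.argmin(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    k, dim, d = 3, 4, 50                               # 3 subspaces of dimension 4 in R^50
    true_bases = [rng.normal(size=(dim, d)) for _ in range(k)]

    def sample(m):
        # m points per subspace, drawn as random coefficients times the basis.
        return np.vstack([rng.normal(size=(m, dim)) @ B for B in true_bases])

    X_small = sample(300)                              # subsample we can afford to fit on
    bases, _ = fit_subspaces(X_small, k, dim)

    X_large = sample(50_000)                           # stand-in for million-scale data
    pred = assign(X_large, bases)                      # cheap, point-wise application
    print("cluster sizes on the large set:", np.bincount(pred, minlength=k))

In this sketch the "learning" and the parametric function are both deliberately simple; the paper's RPCM instead learns the function through a unified robust predictive coding model solved by its own alternating minimization algorithm, which is not reproduced here.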