Multilabel Feature Selection via Shared Latent Sublabel Structure and Simultaneous Orthogonal Basis Clustering

IEEE Trans Neural Netw Learn Syst. 2024 Apr 24:PP. doi: 10.1109/TNNLS.2024.3382911. Online ahead of print.

Abstract

Multilabel feature selection addresses the curse of dimensionality in high-dimensional multilabel data by selecting an informative subset of features. The noisy and incomplete labels of raw multilabel data hinder the acquisition of label-guided information. Existing approaches mitigate label noise by mapping the label space to a low-dimensional latent space through semantic decomposition, which is considered an effective strategy. However, the decomposed latent label space contains redundant label information, which misleads the capture of latent label correlations. To eliminate the effect of this redundant information on the extraction of latent label correlations, we propose SLOFS, a multilabel feature selection method based on a shared latent sublabel structure and simultaneous orthogonal basis clustering. First, a latent orthogonal base structure shared (LOBSS) term is designed to guide the construction of a redundancy-free latent sublabel space via a separated latent clustering center structure. The LOBSS term simultaneously retains the latent sublabel information and the latent clustering center structure. Second, the structure and correlation information of the nonredundant latent sublabels are fully explored: graph regularization enforces structural consistency between the data space and the latent sublabels, which aids the feature selection process. SLOFS employs a dynamic sublabel graph to obtain a high-quality sublabel space and uses regularization to constrain label correlations on the dynamic sublabel projections. Finally, an effective optimization scheme with provable convergence is proposed to solve the SLOFS objective. Experiments on 18 datasets demonstrate that the proposed method consistently outperforms previous feature selection methods.
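
The abstract does not state the SLOFS objective, so the following is only an illustrative sketch of the generic ingredients it names: a low-dimensional latent label space with an orthonormal basis, a graph regularizer tying the data structure to the latent targets, and feature ranking by the row norms of a projection matrix. It is not the SLOFS algorithm. As stated assumptions, a truncated SVD stands in for the LOBSS term and orthogonal basis clustering, a fixed k-nearest-neighbor graph stands in for the dynamic sublabel graph, and a squared Frobenius penalty stands in for whatever sparsity regularizer the paper uses. All function and parameter names (knn_laplacian, latent_sublabels, rank_features, alpha, beta) are hypothetical.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - S of a symmetrized kNN graph on the rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                      # a sample is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]                # indices of the k nearest samples
    S = np.zeros((n, n))
    S[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    S = np.maximum(S, S.T)                              # make the graph undirected
    return np.diag(S.sum(axis=1)) - S


def latent_sublabels(Y, n_latent):
    """Assumed stand-in: truncated SVD, Y ~= V @ B with orthonormal rows in B."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    V = U[:, :n_latent] * s[:n_latent]  # latent representation of each sample's labels
    B = Vt[:n_latent]                   # orthonormal basis of the latent label space
    return V, B


def rank_features(X, Y, n_latent=2, alpha=0.1, beta=1.0, k=5):
    """Rank features by row norms of W, where W minimizes

        ||X W - V||_F^2 + alpha * tr(W^T X^T L X W) + beta * ||W||_F^2

    (a simplified, closed-form surrogate; not the objective from the paper).
    """
    L = knn_laplacian(X, k)
    V, _ = latent_sublabels(Y, n_latent)
    d = X.shape[1]
    A = X.T @ X + alpha * (X.T @ L @ X) + beta * np.eye(d)
    W = np.linalg.solve(A, X.T @ V)      # stationary point of the quadratic objective
    scores = np.linalg.norm(W, axis=1)   # one importance score per feature
    return np.argsort(-scores), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((300, 6))
    Y = (X[:, :2] > 0).astype(float)     # two labels that depend only on features 0 and 1
    order, scores = rank_features(X, Y)
    print("feature ranking:", order)
```

In this toy setup the labels depend only on the first two features, so those two should rank highest. The closed-form solve is possible only because every term in the surrogate is quadratic; a row-sparsity penalty or a graph that is updated during training would require an iterative scheme instead.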