Fast Multilabel Feature Selection via Global Relevance and Redundancy Optimization

IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):5721-5734. doi: 10.1109/TNNLS.2022.3208956. Epub 2024 Apr 4.

Abstract

Information-theoretic methods have attracted great attention in recent years and achieved promising results for multilabel feature selection (MLFS). Nevertheless, most existing methods take a heuristic approach to the grid search for important features, and they may also fail to fully utilize label information. Thus, they are likely to deliver a suboptimal result with a heavy computational burden. In this article, we propose a general optimization framework, global relevance and redundancy optimization (GRRO), to solve the learning problem. The main technical contribution of GRRO is a formulation for MLFS that takes feature relevance, label relevance (i.e., label correlation), and feature redundancy into account, which avoids repetitive entropy calculations and obtains a globally optimal solution efficiently. To further improve efficiency, we extend GRRO to filter out inessential labels and features, thus facilitating fast MLFS. We call the extension GRROfast, in which the key insights are twofold: 1) promising labels and their relevant features are investigated to reduce ineffective calculations over features and even labels, and 2) the framework of GRRO is reconstructed to generate the optimal result with an ensemble. Moreover, our proposed algorithms provide an effective mechanism for exploiting the inherent properties of multilabel data; specifically, we provide a formulation to enhance the proposal with label-specific features. Extensive experiments demonstrate the effectiveness and efficiency of our proposed algorithms.
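
The three quantities named in the abstract (feature relevance, label correlation, and feature redundancy) can be illustrated with a minimal Python sketch. This is not the GRRO formulation, which the abstract does not specify; it is a simplified relevance-minus-redundancy score, assuming discrete features and labels. The function names, the label-weighting rule, and the trade-off parameter `beta` are all hypothetical. The one idea it borrows from the abstract is that each pairwise mutual-information value is computed once and cached in a matrix, rather than recomputed during a step-by-step search.

```python
import numpy as np


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, a_idx = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, b_idx = np.unique(np.asarray(b).ravel(), return_inverse=True)
    # Joint frequency table, normalised to a probability table.
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)  # marginal of a, shape (A, 1)
    pb = joint.sum(axis=0, keepdims=True)  # marginal of b, shape (1, B)
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))


def pairwise_mi(A, B):
    """Matrix M with M[i, j] = I(A[:, i]; B[:, j]), each entry computed once."""
    return np.array([[mutual_information(A[:, i], B[:, j])
                      for j in range(B.shape[1])]
                     for i in range(A.shape[1])])


def score_features(X, Y, beta=1.0):
    """Hypothetical score: label-weighted relevance minus mean redundancy."""
    n_features, n_labels = X.shape[1], Y.shape[1]
    feat_label = pairwise_mi(X, Y)    # feature relevance
    label_label = pairwise_mi(Y, Y)   # label correlation
    feat_feat = pairwise_mi(X, X)     # feature redundancy

    # Assumption: a label correlated with other labels gets a larger weight.
    off_diag = label_label.sum(axis=1) - np.diag(label_label)
    weights = 1.0 + off_diag / max(n_labels - 1, 1)

    relevance = feat_label @ weights
    redundancy = (feat_feat.sum(axis=1) - np.diag(feat_feat)) / max(n_features - 1, 1)
    return relevance - beta * redundancy
```

Ranking the features by the returned scores (for example with `np.argsort(-scores)`) gives a simple filter-style selection; the optimization framework described in the abstract should be expected to differ from this sketch.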