Metafeature Selection via Multivariate Sparse-Group Lasso Learning for Automatic Hyperparameter Configuration Recommendation

IEEE Trans Neural Netw Learn Syst. 2023 Apr 10:PP. doi: 10.1109/TNNLS.2023.3263506. Online ahead of print.

Abstract

The performance of classification algorithms is largely governed by their hyperparameter settings, and the search for desirable hyperparameter configurations is usually quite challenging due to the complexity of datasets. Metafeatures are a group of measures that characterize the underlying dataset from various aspects, and the corresponding recommendation algorithm relies heavily on an appropriate selection of metafeatures. Metalearning (MtL), which aims to improve the learning algorithm itself, requires the integration of features, models, and algorithm learning to accomplish this goal. In this article, we develop a multivariate sparse-group Lasso (SGLasso) model embedded with MtL capacity that recommends suitable configurations via learning. The main idea is to select the principal metafeatures by removing redundant or irregular ones, promoting both the efficiency and the performance of hyperparameter configuration recommendation. Specifically, we first extract the metafeatures and the classification performance of a set of configurations from a collection of historical datasets; a metaregression task is then established through SGLasso to capture the main characteristics of the underlying relationship between metafeatures and historical performance. For a new dataset, the classification performance of each configuration can be estimated from the selected metafeatures, so that the configuration with the highest predicted performance on the new dataset can be recommended. Furthermore, a general MtL architecture combining our model is developed. Extensive experiments on 136 UCI datasets demonstrate the effectiveness of the proposed approach. Empirical results on the well-known SVM show that our model effectively recommends suitable configurations and outperforms existing MtL-based methods as well as well-known search-based algorithms such as random search, Bayesian optimization, and Hyperband.
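
To make the recommendation pipeline concrete, the following is a minimal Python sketch, not the authors' implementation, of a multivariate sparse-group Lasso metaregression fitted by proximal gradient descent. It assumes a metafeature matrix X (one row per historical dataset) and a performance matrix Y (one column per candidate configuration), and treats each row of the coefficient matrix B as a group so that a zeroed row corresponds to a discarded metafeature; all function names, penalty weights, and the toy data are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Element-wise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sg_lasso_prox(B, t, lam, alpha):
    """Proximal operator of the sparse-group Lasso penalty with rows of B as groups."""
    # Element-wise L1 shrinkage (within-group sparsity).
    B = soft_threshold(B, t * alpha * lam)
    # Row-wise (group) shrinkage: a whole metafeature can be dropped at once.
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t * (1.0 - alpha) * lam / np.maximum(norms, 1e-12), 0.0)
    return B * scale

def fit_multivariate_sglasso(X, Y, lam=0.1, alpha=0.5, n_iter=500):
    """Proximal-gradient minimization of 0.5/n * ||Y - XB||_F^2 + SGLasso penalty on B."""
    n, p = X.shape
    B = np.zeros((p, Y.shape[1]))
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / n
        B = sg_lasso_prox(B - step * grad, step, lam, alpha)
    return B

def recommend_configuration(B, x_new):
    """Predict the performance of every configuration for a new dataset and pick the best."""
    return int(np.argmax(x_new @ B))

# Toy usage: 50 historical datasets, 20 metafeatures, 8 candidate configurations.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
Y = rng.normal(size=(50, 8))
B = fit_multivariate_sglasso(X, Y, lam=0.2, alpha=0.5)
selected = np.flatnonzero(np.linalg.norm(B, axis=1) > 0)   # retained metafeatures
best = recommend_configuration(B, rng.normal(size=20))
print("selected metafeatures:", selected, "recommended configuration:", best)
```

In this sketch, the mixed penalty combines an element-wise L1 term with a row-wise L2 term, so tuning alpha trades off individual-coefficient sparsity against whole-metafeature selection; the rows of B that survive shrinkage play the role of the selected metafeatures described in the abstract.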