A New Automatic Hyperparameter Recommendation Approach Under Low-Rank Tensor Completion Framework

IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):4038-4050. doi: 10.1109/TPAMI.2022.3195658. Epub 2023 Mar 7.

Abstract

Hyperparameter optimization (HPO), characterized by hyperparameter tuning, is not only a critical step for effective modeling but also the most time-consuming process in machine learning. Traditional search-based algorithms require extensive configuration evaluations in each round to select desirable hyperparameters, and they are often too inefficient to apply to large-scale tasks. In this paper, we study the HPO problem via a meta-learning (MtL) approach under the low-rank tensor completion (LRTC) framework. Our proposed approach predicts the performance of hyperparameters on new problems from their performance on previous problems, so that suitable hyperparameters can be attained more efficiently. Unlike existing approaches, the hyperparameter performance space is instantiated under a tensor framework that preserves the spatial structure and reflects the correlations among adjacent hyperparameters. When partial evaluations are available for a new problem, estimating the performance of the unevaluated hyperparameters can be formulated as a tensor completion (TC) problem. For this completion task, we develop an LRTC algorithm based on the sum of nuclear norm (SNN) model. A kernelized version is further developed to capture the nonlinear structure of the performance space. In addition, a corresponding coupled matrix factorization (CMF) algorithm is established so that the predictions depend solely on the meta-features, avoiding additional hyperparameter evaluations. Finally, a strategy integrating LRTC and CMF is provided to further enhance the recommendation capacity. We test the recommendation performance of our proposed methods on the classical SVM and on state-of-the-art deep neural networks such as the vision transformer (ViT) and residual network (ResNet); the obtained results demonstrate the effectiveness of our approaches under various evaluation metrics in comparison with baselines commonly used for MtL.
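For context, the sum of nuclear norm model referenced in the abstract is commonly written in the standard LRTC form below; this is a generic sketch of that formulation, and the paper's exact objective, weights, and solver may differ:

    \min_{\mathcal{X}} \; \sum_{i=1}^{N} \alpha_i \, \big\| \mathbf{X}_{(i)} \big\|_{*}
    \quad \text{s.t.} \quad \mathcal{P}_{\Omega}(\mathcal{X}) = \mathcal{P}_{\Omega}(\mathcal{T}),

where \mathcal{X} is the hyperparameter performance tensor to be recovered, \mathbf{X}_{(i)} denotes its mode-i unfolding, \|\cdot\|_{*} is the matrix nuclear norm, \alpha_i \ge 0 are weights over the N modes, \mathcal{T} holds the observed hyperparameter evaluations, and \mathcal{P}_{\Omega} retains only the entries indexed by the observed set \Omega.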