EHHR: an efficient evolutionary hyper-heuristic based recommender framework for short-text classifier selection

Cluster Comput. 2023;26(2):1425-1446. doi: 10.1007/s10586-022-03754-5. Epub 2022 Oct 10.

Abstract

With so many machine learning heuristics available, it is difficult to choose an appropriate one for classifying short text from social media sources such as tweets and reviews. The No Free Lunch theorem asserts that no single heuristic performs best on all problems. Despite their success, the available classifier recommendation algorithms deal only with numeric data. To address these limitations, an umbrella classifier recommender is needed that determines the best heuristic for short-text data. This paper presents an efficient reminisce-enabled classifier recommender framework that recommends a heuristic for classifying new short-text data. The proposed framework, "Efficient Evolutionary Hyper-heuristic based Recommender Framework for Short-text Classifier Selection (EHHR)," reuses previous solutions to predict the performance of various heuristics on an unseen problem. The Hybrid Adaptive Genetic Algorithm (HAGA) in EHHR performs dataset-level feature optimization and performance prediction. HAGA reveals that the influential features for recommending the best short-text heuristic are the average entropy, mean length of the word string, adjective variation, verb variation II, and average hard examples. The experimental results show that HAGA is 80% more accurate than the standard Genetic Algorithm (GA). Additionally, EHHR clusters datasets and ranks heuristics cluster-wise. EHHR clusters 9 out of 10 problems correctly.
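
As an illustration of what dataset-level features of this kind can look like, the following minimal Python sketch computes two of the features named above, average entropy and mean word length, for a small collection of short texts. The abstract does not give the exact definitions used in EHHR, so the formulas here are assumptions: entropy is taken as the Shannon entropy of the whitespace-token distribution within each text, averaged over texts, and word length is the mean character count per token. The function names `token_entropy` and `meta_features` are hypothetical.

```python
import math
from collections import Counter


def token_entropy(text):
    """Shannon entropy (in bits) of the token distribution in one text.

    Assumption: tokens are lowercase whitespace-separated words; the
    paper's own definition of entropy may differ.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def meta_features(texts):
    """Dataset-level features for a list of short texts (illustrative only)."""
    entropies = [token_entropy(t) for t in texts]
    words = [w for t in texts for w in t.split()]
    return {
        # Mean of the per-text entropies.
        "avg_entropy": sum(entropies) / len(entropies) if entropies else 0.0,
        # Mean number of characters per whitespace-separated token.
        "mean_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
    }


if __name__ == "__main__":
    sample = ["great phone great price", "battery died after one week"]
    print(meta_features(sample))
```

A feature vector of this kind, computed once per dataset, is the sort of input that a recommender could compare against previously solved problems; how EHHR actually weights or selects such features is determined by HAGA and is not reproduced here.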

Keywords: Evolutionary algorithm; Hyper-heuristics; Machine learning; Short-text classification; Social media.