Selection Heuristics on Semantic Genetic Programming for Classification Problems

Evol Comput. 2022 Jun 1;30(2):253-289. doi: 10.1162/evco_a_00297.

Abstract

Individual semantics have been used to guide the learning process of Genetic Programming. Novel genetic operators and different ways of performing parent selection have been proposed using semantics. The latter is the focus of this contribution, which proposes three heuristics for parent selection that measure the similarity among individuals' semantics to choose parents that enhance the offspring produced by the functions addition, Naive Bayes, and Nearest Centroid. To the best of our knowledge, this is the first time that functions' properties are used to guide the learning process. As the heuristics were created from the properties of these functions, we apply them only when these functions are used to create offspring. The similarity functions considered are cosine similarity, Pearson's correlation, and agreement. We analyze the heuristics' performance against random selection, state-of-the-art selection schemes, and 18 classifiers, including auto-machine-learning techniques, on 30 classification problems with a varying number of samples, variables, and classes. The results indicate that combining parent selection based on agreement with random selection to replace an individual in the population produces statistically better results than the classical selection and state-of-the-art schemes, and is competitive with state-of-the-art classifiers. Finally, the code is released as open-source software.
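
As a minimal sketch of how the three similarity measures could drive parent selection (written in Python with numpy), the snippet below computes cosine similarity, Pearson's correlation, and agreement on two semantic vectors and picks, among a few randomly drawn candidates, the one least similar to an already chosen parent. The helper names, the candidate-pool size, the least-similar criterion, and the definition of agreement as the fraction of sign coincidences are illustrative assumptions, not the exact procedure used in the paper.

    # Sketch of similarity-based parent selection over individuals' semantics.
    # Assumed definitions for illustration only.
    import numpy as np

    def cosine_similarity(u, v):
        # Cosine of the angle between the two semantic vectors.
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    def pearson_correlation(u, v):
        # Pearson's correlation coefficient between the two semantics.
        return np.corrcoef(u, v)[0, 1]

    def agreement(u, v):
        # Assumed here: fraction of fitness cases where both semantics
        # share the same sign.
        return np.mean(np.sign(u) == np.sign(v))

    def select_second_parent(first_semantics, population_semantics,
                             similarity=agreement, pool_size=4, rng=None):
        # Draw a small random pool of candidates and return the index of
        # the candidate whose semantics are least similar to the first
        # parent (a sketch of a heuristic favoring complementary parents).
        rng = rng or np.random.default_rng()
        candidates = rng.choice(len(population_semantics), size=pool_size,
                                replace=False)
        scores = [similarity(first_semantics, population_semantics[i])
                  for i in candidates]
        return int(candidates[int(np.argmin(scores))])

Preferring the least-similar candidate is one plausible reading of how a similarity-based heuristic could pair parents for functions such as addition; the actual heuristics and their coupling with random replacement are described in the paper and its open-source implementation.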

Keywords: Genetic programming; classification; functions' properties; parent selection; semantics.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Heuristics*
  • Humans
  • Machine Learning
  • Semantics*