Cuckoo Search-Based Optimization for Cancer Classification: A New Hybrid Approach

J Comput Biol. 2022 Jun;29(6):565-584. doi: 10.1089/cmb.2021.0410. Epub 2022 May 6.

Abstract

The design of an optimal framework for the prediction of cancer from high-dimensional and imbalanced microarray data is a challenging job in the fields of bioinformatics and machine learning. There are so many techniques for dimensionality reduction, but it is unclear which of these techniques performs best with different classifiers and datasets. This article focused on the independent component analysis (ICA) features (genes) extraction method for Naïve Bayes (NB) classification of microarray data, because ICA perfectly takes out an independent component from the datasets that satisfy the classification criteria of the NB classifier. A novel hybrid method based on a nature-inspired metaheuristic algorithm is proposed in this article for resolving optimization problems of ICA extracted genes. The cuckoo search (CS) algorithm and artificial bee colony (ABC) for finding the best subset of features to increase the performance of ICA for the NB classifier is designed and executed. According to our investigation, the CS-ABC with ICA was implemented for the first time to resolve the dimensionality reduction problem in high-dimensional microarray biomedical datasets. The CS algorithm improved the local search process of the ABC algorithm, and then the hybrid algorithm CS-ABC provided better optimal gene sets that improved the classification accuracy of the NB classifier. The experimental comparison shows that the CS-ABC approach with the ICA algorithm performs a deeper search in the iterative process, which can avoid premature convergence and produce better results compared with the previously published feature selection algorithm for the NB classifier.

Keywords: Naïve Bayes (NB); cuckoo search (CS); independent component analysis (ICA); machine learning; soft computing.

MeSH terms

  • Algorithms*
  • Animals
  • Bayes Theorem
  • Computational Biology
  • Machine Learning
  • Neoplasms* / genetics