Nature-Inspired Multiobjective Cancer Subtype Diagnosis

IEEE J Transl Eng Health Med. 2019 Mar 7:7:4300112. doi: 10.1109/JTEHM.2019.2891746. eCollection 2019.

Abstract

Cancer gene expression data is of great importance in cancer subtype diagnosis and drug discovery. Many computational methods have been proposed to classify subtypes using such data. However, most previous computational methods suffer from poor interpretability, experimental noise, and low diagnostic quality. To address these problems, multiobjective ensemble cuckoo search based on decomposition (MOECSA) is proposed to optimize four objectives simultaneously: the number of features, the accuracy, and two entropy-based measures, the relevance and the redundancy. In this way, MOECSA classifies cancer gene expression data with high predictive power at different cardinality levels under multiple objectives. A novel binary encoding is proposed to choose gene subsets from the cancer gene expression data for calculating the four objective functions. Furthermore, an effective ensemble mechanism blended into the cuckoo search framework is applied to balance convergence speed and population diversity in MOECSA. To demonstrate the effectiveness and efficiency of the proposed algorithm, experiments on thirty-five benchmark cancer gene expression datasets, four independent disease datasets, and one sequencing-based dataset are carried out to compare MOECSA with six state-of-the-art multiobjective evolutionary algorithms and seven traditional classification algorithms. The experimental results from different perspectives demonstrate that MOECSA has better diagnosis performance than the others at multiple levels.
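As an illustration only, the sketch below shows one way a binary gene mask could be scored on four objectives of the kind named in the abstract. It is not the authors' implementation. The abstract does not give the formulas, so everything here is an assumption: the function names (`evaluate`, `mutual_information`, `nearest_centroid_accuracy`), the use of mutual information on discretized expression values for relevance and redundancy, and the resubstitution nearest-centroid classifier standing in for the accuracy objective.

```python
import numpy as np


def _entropy_from_counts(counts):
    """Shannon entropy in bits from an array of category counts."""
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def entropy(x):
    """Entropy of a discrete 1-D array."""
    _, counts = np.unique(x, return_counts=True)
    return _entropy_from_counts(counts)


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two discrete 1-D arrays."""
    _, joint_counts = np.unique(
        np.column_stack([a, b]), axis=0, return_counts=True
    )
    return entropy(a) + entropy(b) - _entropy_from_counts(joint_counts)


def nearest_centroid_accuracy(X, y):
    """Placeholder accuracy: nearest-centroid fit and scored on the same data.

    Assumption for brevity; a real study would use held-out or
    cross-validated accuracy.
    """
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == y).mean())


def evaluate(mask, X_disc, y):
    """Score a binary gene mask on four objectives (hypothetical definitions).

    mask   : 0/1 vector with one entry per gene; 1 means the gene is selected
    X_disc : samples x genes matrix of discretized expression levels
    y      : class label (subtype) per sample
    Returns (n_features, accuracy, relevance, redundancy).
    """
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        # An empty subset cannot be classified; return neutral scores.
        return 0, 0.0, 0.0, 0.0
    X_sel = X_disc[:, idx]
    n_features = int(idx.size)
    accuracy = nearest_centroid_accuracy(X_sel.astype(float), y)
    # Relevance: mean mutual information between each selected gene and y.
    relevance = float(
        np.mean([mutual_information(X_sel[:, j], y) for j in range(n_features)])
    )
    # Redundancy: mean pairwise mutual information among selected genes.
    pair_scores = [
        mutual_information(X_sel[:, i], X_sel[:, j])
        for i in range(n_features)
        for j in range(i + 1, n_features)
    ]
    redundancy = float(np.mean(pair_scores)) if pair_scores else 0.0
    return n_features, accuracy, relevance, redundancy
```

Under these assumptions, a multiobjective search would try to keep the feature count and redundancy low while keeping accuracy and relevance high, so each candidate mask maps to one four-element objective vector.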

Keywords: Classification; cancer subtype diagnosis; feature selection; multiobjective optimization.

Grants and funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61603087, the Natural Science Foundation of Jilin Province under Grant 20190103006JH, the Science and Technology Development Planning of Jilin Province under Grant 20160204043GX, the Fundamental Research Funds for the Central Universities under Grant 2412017FZ026, and the Chongqing High-Performance Computing Platform: 991 cstc2015ptfw-ggfw120002.