Integration of multi-objective PSO based feature selection and node centrality for medical datasets

Genomics. 2020 Nov;112(6):4370-4384. doi: 10.1016/j.ygeno.2020.07.027. Epub 2020 Jul 25.

Abstract

In the past decades, advances in computer and database technologies have led to the rapid growth of large-scale medical datasets. On the other hand, medical applications that work with high-dimensional datasets and require high speed and accuracy are rapidly increasing. Feature selection is a dimensionality reduction approach that can increase the accuracy of disease diagnosis and reduce its computational complexity. In this paper, a novel PSO-based multi-objective feature selection method is proposed. The proposed method consists of three main phases. In the first phase, the original features are represented as a graph. In the second phase, feature centralities are calculated for all nodes in the graph. Finally, in the third phase, an improved PSO-based search process is used for the final feature selection. The results on five medical datasets indicate that the proposed method improves on previous related methods in terms of efficiency and effectiveness.
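The three phases named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the edge weights, the centrality measure, the objectives, or the PSO improvements, so everything below is an assumption made for illustration: edges are weighted by absolute Pearson correlation, centrality is normalized weighted degree, the hypothetical `fitness` function collapses the objectives into one scalar score instead of performing a true multi-objective search, and the search is a plain binary PSO with a sigmoid transfer function. The toy data is invented.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def feature_graph(columns):
    """Phase 1 (assumed): nodes are features, edge weight = |Pearson correlation|."""
    d = len(columns)
    w = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            w[i][j] = w[j][i] = abs(pearson(columns[i], columns[j]))
    return w

def degree_centrality(w):
    """Phase 2 (assumed): weighted degree, normalized by the number of other nodes."""
    d = len(w)
    return [sum(row) / (d - 1) for row in w]

def fitness(mask, relevance, centrality):
    """Hypothetical scalarized score: reward relevance, penalize centrality and size."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    if not chosen:
        return float("-inf")  # an empty subset is never acceptable
    rel = sum(relevance[i] for i in chosen) / len(chosen)
    red = sum(centrality[i] for i in chosen) / len(chosen)
    return rel - red - 0.1 * len(chosen) / len(mask)

def binary_pso(relevance, centrality, particles=10, iters=30, seed=0):
    """Phase 3 (assumed): plain binary PSO, not the paper's improved variant."""
    rng = random.Random(seed)
    d = len(relevance)
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(particles)]
    vel = [[0.0] * d for _ in range(particles)]
    best = [p[:] for p in pos]
    best_fit = [fitness(p, relevance, centrality) for p in pos]
    g = max(range(particles), key=lambda k: best_fit[k])
    gbest, gfit = best[g][:], best_fit[g]
    for _ in range(iters):
        for k in range(particles):
            for j in range(d):
                v = (0.7 * vel[k][j]
                     + 1.5 * rng.random() * (best[k][j] - pos[k][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[k][j]))
                vel[k][j] = max(-4.0, min(4.0, v))  # clamp velocity
                # Sigmoid transfer: velocity becomes the probability of selecting the feature.
                pos[k][j] = 1 if rng.random() < 1 / (1 + math.exp(-vel[k][j])) else 0
            f = fitness(pos[k], relevance, centrality)
            if f > best_fit[k]:
                best[k], best_fit[k] = pos[k][:], f
                if f > gfit:
                    gbest, gfit = pos[k][:], f
    return gbest, gfit

# Invented toy data: four features (one list per feature) and a binary label.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 1, 5, 6, 5],      # informative
    [2, 4, 2, 10, 12, 10],   # exact rescaling of the first feature (redundant)
    [3, 1, 2, 3, 1, 2],      # uncorrelated with the label
    [0, 1, 0, 1, 0, 1],      # weakly informative
]
w = feature_graph(columns)
centrality = degree_centrality(w)
relevance = [abs(pearson(col, labels)) for col in columns]
selected, score = binary_pso(relevance, centrality)
print("selected mask:", selected, "score:", round(score, 3))
```

In this sketch a feature with high centrality is strongly correlated with many other features, so the score treats centrality as a redundancy penalty. That is one plausible way to combine node centrality with a PSO search, not necessarily how the paper does it.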

Keywords: Data mining; Feature selection; Medical diagnosis; Multi-objective; Particle swarm optimization.

MeSH terms

  • Algorithms*
  • Data Mining
  • Datasets as Topic
  • Diagnosis*
  • Humans
  • Neoplasms / diagnosis