Feature selection algorithm based on optimized genetic algorithm and the application in high-dimensional data processing

Guilian Feng

doi:10.1371/journal.pone.0303088

PLoS One. 2024 May 9;19(5):e0303088. doi: 10.1371/journal.pone.0303088. eCollection 2024.

Author

Guilian Feng¹

Affiliation

¹ School of Physics and Electronic Information Engineering, Qinghai Minzu University, Xining, China.

Abstract

High-dimensional data is widely used in many fields, but selecting key features from it is challenging. Feature selection reduces data dimensionality and dampens noise interference, thereby improving model efficiency and interpretability. To improve the efficiency and accuracy of high-dimensional data processing, this study proposes a feature selection method based on an optimized genetic algorithm. The algorithm simulates the process of natural selection, searches the space of candidate feature subsets, and finds the subset that optimizes model performance. The results show that the recognition rate is very low when the value of K is less than 4 or greater than 8. After adaptive bias filtering, 724 features are reduced to 372, and the accuracy improves from 0.9352 to 0.9815. Going from 714 features to 406 Gaussian codes improves the accuracy from 0.9625 to 0.9754. Across all tests, the colon data set has the highest average accuracy, followed by the small round blue cell tumor (SRBCT), lymphoma, central nervous system (CNS), and ovary data sets. The green curve performs best: it is stable over the time range 0-300 and, while maintaining efficiency, reaches 4.48 the soonest. The feature selection method has practical significance for high-dimensional data processing: it improves the efficiency and accuracy of data processing and offers an effective new approach to the problem.
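The search described in the abstract can be illustrated with a minimal sketch of a plain genetic algorithm over feature masks. It is not the optimized algorithm of the paper: the adaptive bias filtering and Gaussian coding steps are not reproduced. The function names, the parameter values, and the toy fitness function are all assumptions made for illustration. In practice the fitness would be something like the cross-validated accuracy of a classifier trained on the selected features.

```python
import random


def ga_feature_selection(n_features, fitness, pop_size=30, generations=40,
                         crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Plain GA over 0/1 feature masks (illustrative, not the paper's variant).

    Requires n_features >= 2 so that a one-point crossover cut exists.
    """
    rng = random.Random(seed)
    # Each chromosome is a mask: 1 keeps the feature, 0 drops it.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]

        def tournament():
            # Binary tournament: pick two individuals, keep the fitter one.
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)

        pop = children
        best = max(pop, key=fitness)
    return best


# Hypothetical toy fitness: features 0, 3 and 7 are "informative";
# every selected feature costs 0.1, so small subsets are preferred.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    mask = ga_feature_selection(12, toy_fitness)
    print([i for i, g in enumerate(mask) if g], toy_fitness(mask))
```

Because the best mask is copied unchanged into each new generation, the best fitness never decreases from one generation to the next. Tournament selection and one-point crossover are just one common choice of operators among several.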

Copyright: © 2024 Guilian Feng. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Humans

Grants and funding

The author(s) received no specific funding for this work.