Medical data set classification using a new feature selection algorithm combined with twin-bounded support vector machine

Med Biol Eng Comput. 2020 Mar;58(3):519-528. doi: 10.1007/s11517-019-02100-z. Epub 2020 Jan 4.

Abstract

Early diagnosis and treatment are among the most important strategies for preventing deaths from several diseases. In this regard, data mining and machine learning techniques have been useful tools for minimizing errors and providing useful information for diagnosis. This paper presents a new feature selection algorithm. To validate it, we used eight benchmark data sets that are commonly used by researchers developing machine learning methods for medical data classification. The experiments show that the proposed feature selection method combined with a twin-bounded support vector machine (FSTBSVM) is very efficient. The robustness of FSTBSVM is examined using classification accuracy, sensitivity, and specificity. FSTBSVM is a very promising technique for classification, and the results show that it is capable of producing good results with fewer features than the original data sets.

Graphical abstract: Model using the new feature selection and grid search with 10-fold CV to optimize model parameters in FSTBSVM.
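The three evaluation measures named in the abstract (classification accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch for binary classification. The function names `confusion_counts` and `evaluate`, the encoding of the positive (diseased) class as `1`, and the example labels are all assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (true positive rate) and
    specificity (true negative rate); NaN when a ratio is undefined."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else math.nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


# Hypothetical labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = evaluate(y_true, y_pred)
```

In a 10-fold cross-validation setting such as the one mentioned in the graphical abstract, a function like this would typically be applied to the held-out fold of each split and the per-fold values averaged.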

Keywords: Classification; Data mining; Feature selection; Medical data set; Twin-bounded support vector machine.

MeSH terms

  • Databases as Topic
  • Female
  • Humans
  • Neural Networks, Computer
  • Support Vector Machine*