Tuning parameter estimation in SCAD-support vector machine using firefly algorithm with application in gene selection and cancer classification

Comput Biol Med. 2018 Dec 1:103:262-268. doi: 10.1016/j.compbiomed.2018.10.034. Epub 2018 Oct 31.

Abstract

In cancer classification, gene selection is one of the most important bioinformatics-related topics. Gene selection can be treated as a variable selection problem, which aims to find a small subset of genes that carries the most discriminative information for the classification target. The penalized support vector machine (PSVM) has proved effective at building a strong classifier that combines the advantages of the support vector machine with those of penalization. PSVM with a smoothly clipped absolute deviation (SCAD) penalty is the most widely used method. However, the efficiency of PSVM with SCAD depends on choosing an appropriate value for the tuning parameter in the SCAD penalty. In this paper, a firefly algorithm, a metaheuristic for continuous optimization, is proposed to determine the tuning parameter in PSVM with the SCAD penalty. The proposed algorithm helps to efficiently find the most relevant genes while achieving high classification performance. Experimental results on four benchmark gene expression datasets show the superior performance of the proposed algorithm, in terms of both classification accuracy and the number of selected genes, compared with competing methods.
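To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It shows the standard SCAD penalty of Fan and Li with its conventional default a = 3.7, and a basic one-dimensional firefly search in the usual form (attraction beta0 * exp(-gamma * r^2) plus a small random step). The function names, the swarm settings, and the toy objective are all assumptions made for illustration; the abstract does not state the fitness function or the parameter values that were used.

```python
import math
import random


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for a single coefficient (assumes lam > 0 and a > 2)."""
    b = abs(beta)
    if b <= lam:
        return lam * b  # behaves like the L1 penalty near zero
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return (a + 1) * lam * lam / 2  # constant, so large coefficients are not shrunk further


def firefly_minimize(objective, lower, upper, n_fireflies=15, n_iter=50,
                     beta0=1.0, gamma=1.0, alpha=0.1, seed=0):
    """One-dimensional firefly search for a minimizer of `objective` on [lower, upper]."""
    rng = random.Random(seed)
    width = upper - lower
    pos = [rng.uniform(lower, upper) for _ in range(n_fireflies)]
    cost = [objective(x) for x in pos]
    best = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = pos[best], cost[best]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter (lower cost)
                    r = (pos[i] - pos[j]) / width  # distance scaled by the search range
                    attract = beta0 * math.exp(-gamma * r * r)
                    step = alpha * width * (rng.random() - 0.5)
                    x = pos[i] + attract * (pos[j] - pos[i]) + step
                    pos[i] = min(max(x, lower), upper)  # keep the candidate inside the bounds
                    cost[i] = objective(pos[i])
                    if cost[i] < best_cost:
                        best_x, best_cost = pos[i], cost[i]
        alpha *= 0.95  # shrink the random step so the swarm settles
    return best_x, best_cost


# Toy stand-in objective (an assumption): in practice this might be, for example,
# the cross-validated error of a SCAD-penalized SVM fitted with tuning parameter lam.
def toy_objective(lam):
    return (lam - 0.3) ** 2


best_lam, best_cost = firefly_minimize(toy_objective, 0.01, 1.0)
print(best_lam, best_cost)
```

In a real gene selection setting, the toy objective would be replaced by a function that fits the penalized classifier for a given tuning parameter and returns a validation error, possibly combined with the number of selected genes. Which criterion is appropriate is a modeling choice that this sketch does not settle.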

Keywords: Cancer classification; Firefly algorithm; Gene selection; Penalized support vector machine; SCAD.

MeSH terms

  • Computational Biology
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Humans
  • Neoplasms / classification*
  • Neoplasms / genetics*
  • Neoplasms / metabolism
  • Support Vector Machine*