A Novel Hybrid Feature Selection Model for Classification of Neuromuscular Dystrophies Using Bhattacharyya Coefficient, Genetic Algorithm and Radial Basis Function Based Support Vector Machine

Interdiscip Sci. 2018 Jun;10(2):244-250. doi: 10.1007/s12539-016-0183-6. Epub 2016 Sep 17.

Abstract

Accurate classification of neuromuscular disorders is important for providing patients with proper treatment. Microarray technology has recently been employed to monitor the activity, or expression level, of a large number of genes simultaneously. Gene expression data derived from microarray experiments usually involve a large number of genes but very few samples. The dimensionality of such data therefore needs to be reduced to a small set of discriminative genes that accurately classifies samples of the various diseases. Our goal is to find a small subset of genes that ensures accurate classification of neuromuscular disorders. In the present paper, we propose a novel hybrid feature selection model for the classification of neuromuscular disorders. Feature selection is performed in two phases by integrating the Bhattacharyya coefficient and a genetic algorithm (GA). In the first phase, the Bhattacharyya coefficient is computed to choose a candidate gene subset by removing the most redundant genes. In the second phase, the target gene subset is created by applying the GA to select the most discriminative genes, with the fitness function calculated using a radial basis function support vector machine (RBF SVM). The proposed hybrid algorithm is applied to two publicly available microarray datasets of neuromuscular disorders. The results are compared with those of two individual feature selection techniques, namely the Bhattacharyya coefficient and GA, and of one integrated technique, Bhattacharyya-GA, in which the GA fitness function is calculated using four other classifiers. The comparison shows that the proposed integrated method achieves better classification accuracy.
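
As a rough illustration of the first phase only, the sketch below scores each gene by the Bhattacharyya coefficient between its expression values in two classes and keeps the genes whose class distributions overlap least. The abstract does not say how the coefficient is estimated or how the cut-off is chosen, so several things here are assumptions rather than the authors' method: each class is modelled as a univariate Gaussian per gene, the problem is treated as two-class, and the names `bhattacharyya_coefficient`, `rank_genes`, and `keep`, as well as the toy data, are hypothetical. The second phase (GA with an RBF SVM fitness function) is not sketched.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_coefficient(a, b, eps=1e-12):
    """Overlap of two samples, each modelled as a univariate Gaussian.

    Returns a value in (0, 1]: 1 means identical distributions,
    values near 0 mean the two classes barely overlap.
    Assumption: Gaussian class-conditional model (not stated in the abstract).
    """
    mu_a, mu_b = mean(a), mean(b)
    # eps guards against zero variance in tiny samples
    var_a = pvariance(a) + eps
    var_b = pvariance(b) + eps
    # Bhattacharyya distance between two univariate normal distributions
    dist = (0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
            + 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b))))
    return math.exp(-dist)


def rank_genes(expression, labels, keep):
    """Return indices of the `keep` genes with the lowest class overlap.

    expression: list of samples, each a list of per-gene expression values
    labels:     list of 0/1 class labels, one per sample
    """
    n_genes = len(expression[0])
    scores = []
    for g in range(n_genes):
        class0 = [row[g] for row, y in zip(expression, labels) if y == 0]
        class1 = [row[g] for row, y in zip(expression, labels) if y == 1]
        scores.append((bhattacharyya_coefficient(class0, class1), g))
    scores.sort()  # lowest coefficient (least overlap) first
    return [g for _, g in scores[:keep]]


# Toy data (hypothetical): 6 samples x 3 genes; only gene 0 separates the classes.
expression = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [0.9, 4.9, 0.4],
    [3.0, 5.0, 0.3],
    [3.2, 5.2, 0.8],
    [2.9, 4.8, 0.5],
]
labels = [0, 0, 0, 1, 1, 1]
candidates = rank_genes(expression, labels, keep=1)
print(candidates)
```

In a pipeline of the kind the abstract describes, the indices returned here would form the candidate gene subset handed to the GA, whose fitness for each candidate subset would be the classification accuracy of an RBF SVM trained on those genes.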

Keywords: Bhattacharyya coefficient; Genetic algorithm; Microarray data; Neuromuscular disorders; Radial basis function; Support vector machine.

MeSH terms

  • Humans
  • Models, Theoretical*
  • Muscular Dystrophies / classification*
  • Muscular Dystrophies / genetics
  • Support Vector Machine*