Feature gene selection method based on logistic and correlation information entropy

Biomed Mater Eng. 2015:26 Suppl 1:S1953-9. doi: 10.3233/BME-151498.

Abstract

Gene expression profile data are high-dimensional, small-sample, nonlinear and numeric. To address these characteristics, logistic regression and correlation information entropy are introduced into feature gene selection. First, the genes are screened preliminarily by logistic regression to retain those that have a greater impact on the classification. Next, a candidate feature set is generated by deleting irrelevant features with the Relief algorithm. Redundant features are then removed from this set using correlation information entropy. Finally, the resulting feature gene subset is classified with a support vector machine (SVM). Experimental results show that the proposed method obtains a smaller gene subset and achieves a higher recognition rate.
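The abstract gives no formulas, so the following is only a minimal sketch of the two less familiar steps, Relief weighting and correlation information entropy, under stated assumptions. The function names `relief_weights` and `correlation_information_entropy` are hypothetical. The Relief variant shown (two classes, Manhattan distance, every sample visited once) and the entropy definition (computed from the eigenvalues of the feature correlation matrix, with logarithm base equal to the number of features) are commonly used forms and may differ from what the paper actually implements.

```python
import numpy as np

def relief_weights(X, y):
    """Basic two-class Relief weights (assumed variant, not the paper's code).

    A feature gains weight when it differs on the nearest sample of the
    other class (miss) and loses weight when it differs on the nearest
    sample of the same class (hit). Assumes each class has >= 2 samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale each feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    Xs = (X - lo) / span
    n, d = Xs.shape
    w = np.zeros(d)
    for i in range(n):  # visit every sample once instead of random sampling
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance
        dist[i] = np.inf                       # exclude the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n
    return w

def correlation_information_entropy(X):
    """Entropy of the normalized eigenvalues of the feature correlation matrix.

    Assumed definition: H = -sum_i (l_i / d) * log_d(l_i / d), where l_i are
    the eigenvalues of the d x d correlation matrix. H is close to 1 for
    uncorrelated features and close to 0 for fully redundant ones.
    Assumes no feature column is constant.
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    if d < 2:
        raise ValueError("need at least two features")
    R = np.corrcoef(X, rowvar=False)
    eig = np.clip(np.linalg.eigvalsh(R), 0.0, None)  # clip round-off negatives
    p = eig / d           # eigenvalues of a correlation matrix sum to d
    p = p[p > 0]          # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum() / np.log(d))

# Toy data: feature 0 separates the classes, feature 1 does not.
X_demo = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_demo = [0, 0, 1, 1]
print(relief_weights(X_demo, y_demo))
print(correlation_information_entropy(X_demo))
```

In a pipeline of the kind the abstract describes, genes with low Relief weight would be dropped as irrelevant, and a gene whose removal leaves the entropy of the remaining subset nearly unchanged would be a candidate for deletion as redundant. The thresholds for both decisions are not stated in the abstract and would have to be chosen separately.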

Keywords: Gene chips; correlation information entropy; feature selection; logistic.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Entropy
  • Gene Expression Profiling / methods*
  • Humans
  • Linear Models*
  • Neoplasm Proteins / metabolism*
  • Neoplasms / metabolism*
  • Pattern Recognition, Automated / methods*
  • Regression Analysis
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Statistics as Topic

Substances

  • Neoplasm Proteins