Identification of informative genes and pathways using an improved penalized support vector machine with a weighting scheme

Comput Biol Med. 2016 Oct 1:77:102-15. doi: 10.1016/j.compbiomed.2016.08.004. Epub 2016 Aug 4.

Abstract

Incorporating pathway knowledge into microarray analysis has improved the biological interpretation of analysis outcomes. However, most pathway data are manually curated without a specific biological context. As a result, non-informative genes can be included when pathway data are used to analyze context-specific data such as cancer microarray data. Efficient identification of informative genes is therefore essential. Embedded methods such as penalized classifiers have been used for microarray analysis because they perform gene selection as part of model fitting. This paper proposes an improved penalized support vector machine with an absolute t-test weighting scheme to identify informative genes and pathways. Experiments are conducted on four microarray data sets. The results are compared with those of previous methods using 10-fold cross-validation in terms of accuracy, sensitivity, specificity and F-score. Our method shows consistent improvement over the previous methods, and biological validation was performed to elucidate the relationship of the selected genes and pathways with the phenotype under study.
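
As a rough illustration of two ingredients named in the abstract, the sketch below computes an absolute t-statistic for each gene and the four reported evaluation measures for a binary phenotype. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which form of the t-test is used or how the weights enter the penalty function, so the Welch (unequal-variance) form is assumed here and the penalized support vector machine itself is not shown. The function names `abs_t_weights` and `binary_metrics` and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def abs_t_weights(X, y):
    """Return |t| for each gene (column of X) between class 1 and class 0.

    Assumption: Welch's unequal-variance t-statistic. Each class needs at
    least two samples, because the sample variance is undefined otherwise.
    """
    n_genes = len(X[0])
    weights = []
    for j in range(n_genes):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        # Standard error of the difference between the two class means.
        se = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
        t = (mean(pos) - mean(neg)) / se if se > 0 else 0.0
        weights.append(abs(t))
    return weights


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F-score for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty classes or no positive predictions.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on the positive class
    specificity = ratio(tn, tn + fp)   # recall on the negative class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy expression matrix (hypothetical): rows are samples, columns are genes.
X = [[1.0, 5.0], [2.0, 5.1], [3.0, 4.9],
     [7.0, 5.0], [8.0, 5.2], [9.0, 4.8]]
y = [0, 0, 0, 1, 1, 1]

weights = abs_t_weights(X, y)
print(weights)
print(binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0]))
```

In the toy data the first gene separates the two classes and the second does not, so the first gene receives the larger weight. In a cross-validated setting such weights would normally be recomputed on each training fold only, so that the held-out samples do not influence gene ranking.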

Keywords: Artificial intelligence; Bioinformatics; Informative genes; Pathway-based microarray analysis; Penalized support vector machine; Penalty function; Weighting scheme.

MeSH terms

  • Animals
  • Apoptosis / genetics
  • Cell Cycle / genetics
  • Computational Biology / methods*
  • Gene Expression Profiling
  • Gene Regulatory Networks / genetics*
  • Humans
  • Mice
  • Microarray Analysis
  • Neoplasms / genetics
  • Neoplasms / metabolism
  • Support Vector Machine*
  • Transcriptome / genetics*