Local-Nearest-Neighbors-Based Feature Weighting for Gene Selection

IEEE/ACM Trans Comput Biol Bioinform. 2018 Sep-Oct;15(5):1538-1548. doi: 10.1109/TCBB.2017.2712775. Epub 2017 Jun 7.

Abstract

Selecting functional genes is essential for analyzing microarray data. Among the many available feature (gene) selection approaches, those based on large margin nearest neighbors have received the most attention because of their low computational cost and high accuracy on high-dimensional data. Yet several problems still hamper existing approaches in identifying the real target genes, including the selection of erroneous nearest neighbors, high sensitivity to irrelevant genes, and inappropriate evaluation criteria. Pioneering earlier work has partly addressed some of these problems, but no existing approach solves all of them simultaneously. In this paper, we propose a new local-nearest-neighbors-based feature weighting approach to alleviate the above problems. The proposed approach is based on locally minimizing the within-class distances and maximizing the between-class distances under the nearest neighbors rule. We further define a feature weight vector and construct it by minimizing a cost function with a regularization term. The proposed approach applies naturally to multi-class problems without extra modification. Experimental results on UCI and open microarray data sets validate the effectiveness and efficiency of the new approach.
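The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Relief-style setup: for each sample, the nearest same-class neighbor (hit) and nearest other-class neighbor (miss) are found under a weighted L1 distance, and non-negative feature weights are fitted by projected gradient descent on a logistic margin loss with an L1 penalty. All function names, the loss, the distance, and the parameters `lam`, `lr`, `outer`, and `inner` are assumptions made for illustration.

```python
import math
import random


def weighted_l1(a, b, w):
    """Weighted L1 distance between two samples (assumed distance)."""
    return sum(wj * abs(aj - bj) for aj, bj, wj in zip(a, b, w))


def margin_vectors(X, y, w):
    """Per sample: |x - nearest miss| - |x - nearest hit|, feature by feature.

    Assumes every class has at least two samples, so a hit always exists.
    Any sample from any other class can be the miss, so the same code
    handles two-class and multi-class labels.
    """
    n = len(X)
    Z = []
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        h = min(hits, key=lambda j: weighted_l1(X[i], X[j], w))
        m = min(misses, key=lambda j: weighted_l1(X[i], X[j], w))
        Z.append([abs(a - c) - abs(a - b)
                  for a, b, c in zip(X[i], X[h], X[m])])
    return Z


def loss_slope(s):
    """Derivative of log(1 + exp(-s)) with respect to s, computed stably."""
    if s >= 0:
        e = math.exp(-s)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(s))


def fit_weights(X, y, lam=0.1, lr=0.1, outer=5, inner=100):
    """Alternate between neighbor search and regularized weight updates."""
    d = len(X[0])
    w = [1.0] * d
    for _ in range(outer):
        Z = margin_vectors(X, y, w)  # neighbors depend on current weights
        for _ in range(inner):
            grad = [lam] * d  # gradient of lam * sum(w) for w >= 0
            for z in Z:
                s = sum(wj * zj for wj, zj in zip(w, z))
                g = loss_slope(s) / len(Z)
                for j in range(d):
                    grad[j] += g * z[j]
            # Projected gradient step keeps the weights non-negative.
            w = [max(0.0, wj - lr * gj) for wj, gj in zip(w, grad)]
    return w


def make_toy_data(seed=0, per_class=10):
    """Three classes; feature 0 tracks the class, features 1-2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(per_class):
            X.append([label + rng.gauss(0.0, 0.1), rng.random(), rng.random()])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(fit_weights(X, y))
```

On the synthetic data, the weight of the informative feature should end up larger than the weights of the two noise features, which the L1 penalty pushes toward zero. Ranking genes by the fitted weights is one way such a weight vector could be used for selection.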

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Gene Expression Profiling
  • Genes / genetics*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated
  • Sequence Analysis, DNA