Characteristic gene selection via weighting principal components by singular values

PLoS One. 2012;7(7):e38873. doi: 10.1371/journal.pone.0038873. Epub 2012 Jul 10.

Abstract

Conventional gene selection methods based on principal component analysis (PCA) use only the first principal component (PC) of PCA or sparse PCA to select characteristic genes. These methods implicitly assume that the first PC plays a dominant role in gene selection. However, in a number of cases this assumption does not hold, and in those cases the conventional PCA-based methods usually give poor selection results. To improve the performance of PCA-based gene selection, we propose a gene selection method that weights PCs by their singular values (WPCS). Because different PCs differ in importance, the singular values are used as weights that represent the influence of each PC on gene selection. ROC curves and AUC statistics on artificial data show that our method outperforms the state-of-the-art methods. Moreover, experimental results on real gene expression data sets show that our method can extract more characteristic genes in response to abiotic stresses than conventional gene selection methods.
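
The idea of weighting several PCs by their singular values can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so everything below is an assumption made for illustration: the data matrix is taken to be genes by samples, each gene is centered across samples, and a gene's score is the sum over the leading PCs of its absolute loading multiplied by the corresponding singular value. The function names `wpcs_scores` and `select_genes` and the parameters `n_components` and `n_genes` are hypothetical and do not come from the paper.

```python
import numpy as np


def wpcs_scores(X, n_components=3):
    """Score each gene (row of X) using several singular-value-weighted PCs.

    Assumed scoring rule (not taken from the paper):
        score_i = sum_k s_k * |U[i, k]|  over the first n_components PCs.
    """
    X = np.asarray(X, dtype=float)
    # Center each gene across samples so the SVD captures variation only.
    Xc = X - X.mean(axis=1, keepdims=True)
    # Thin SVD: columns of U hold per-gene loadings, s the singular values.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, s.size)
    # Weight the absolute loadings of each PC by its singular value and sum.
    return np.abs(U[:, :k]) @ s[:k]


def select_genes(X, n_genes=10, n_components=3):
    """Return the indices of the n_genes highest-scoring genes."""
    scores = wpcs_scores(X, n_components)
    return np.argsort(scores)[::-1][:n_genes]
```

With `n_components=1` this sketch reduces to ranking genes by the first PC alone, which is the conventional approach the abstract argues against; larger values let genes that load mainly on later PCs contribute to the ranking.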

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Arabidopsis / genetics*
  • Arabidopsis / radiation effects
  • Arabidopsis Proteins / genetics*
  • Area Under Curve
  • Cold Temperature
  • Droughts
  • Gene Expression Profiling
  • Gene Expression Regulation, Plant* / radiation effects
  • Plant Roots / genetics*
  • Plant Roots / radiation effects
  • Plant Shoots / genetics*
  • Plant Shoots / radiation effects
  • Principal Component Analysis / methods*
  • ROC Curve
  • Salinity
  • Stress, Physiological
  • Ultraviolet Rays

Substances

  • Arabidopsis Proteins