Multidimensional support vector machines for visualization of gene expression data

Bioinformatics. 2005 Feb 15;21(4):439-44. doi: 10.1093/bioinformatics/bti188. Epub 2004 Dec 17.

Abstract

Motivation: Because DNA microarray experiments produce huge amounts of gene expression data, statistical methods are needed to extract meaning from the experimental results. Dimensionality reduction methods such as Principal Component Analysis (PCA) are used to roughly visualize the distribution of high-dimensional gene expression data. However, in binary classification of gene expression data, PCA does not use class information when choosing axes. Thus data that are clearly separable in the original space may not be separable in the reduced space produced by PCA.
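
The limitation of PCA described above can be illustrated with a small made-up example. The two-dimensional points below are hypothetical and are not taken from the paper's datasets: the first coordinate varies widely within both classes, while only the second coordinate distinguishes them. The first principal axis follows the high-variance coordinate, so the one-dimensional projections of the two classes overlap even though the classes are perfectly separable in the original space.

```python
import math

# Hypothetical toy data: x spreads widely, y alone separates the classes.
class_a = [(float(x), 0.5) for x in range(-5, 6)]
class_b = [(float(x), -0.5) for x in range(-5, 6)]
points = class_a + class_b


def first_principal_axis(pts):
    """Leading eigenvector of the 2x2 covariance matrix (class labels unused)."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / n
    syy = sum((p[1] - my) ** 2 for p in pts) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
    # Orientation of the major axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def project(pts, axis):
    """Scalar coordinate of each point along a unit axis."""
    return [p[0] * axis[0] + p[1] * axis[1] for p in pts]


def ranges_overlap(a, b):
    """True if the intervals spanned by two lists of scalars intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


pca_axis = first_principal_axis(points)
overlap_on_pca = ranges_overlap(project(class_a, pca_axis),
                                project(class_b, pca_axis))
overlap_on_y = ranges_overlap(project(class_a, (0.0, 1.0)),
                              project(class_b, (0.0, 1.0)))
print("first principal axis:", pca_axis)
print("classes overlap on PCA axis:", overlap_on_pca)
print("classes overlap on y axis:", overlap_on_y)
```

In this toy case an axis chosen with class labels in mind (here, the y axis) keeps the classes apart, which is the kind of axis a class-aware method is meant to find.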

Results: For visualization and class prediction of gene expression data, we have developed a new SVM-based method, called multidimensional SVMs, that generates multiple orthogonal axes. This method projects high-dimensional data into a lower-dimensional space to exhibit properties of the data clearly and to visualize the distribution of the data roughly. Furthermore, the multiple axes can be used for class prediction. The basic properties of conventional SVMs are retained in our method: solutions of the mathematical programming problem are sparse, and nonlinear classification is implemented implicitly through the use of kernel functions. Application of our method to experimentally obtained gene expression datasets from patient samples indicates that our algorithm is efficient and useful for visualization and class prediction.
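
The abstract does not give the formulation of multidimensional SVMs, so the following is only a rough sketch of the general idea of obtaining several mutually orthogonal, class-aware axes. It swaps in a simpler technique: a linear soft-margin SVM is fitted by subgradient descent, the component along its weight vector is removed from the data, and the fit is repeated. This is not the authors' mathematical programming formulation, it does not use kernels, and the function names, hyperparameters, and toy data are all assumptions made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def orthogonal_svm_axes(X, y, n_axes=2):
    """Fit an SVM, remove its direction from the data, and refit (assumed scheme)."""
    axes = []
    residual = [list(x) for x in X]
    for _ in range(n_axes):
        w, _bias = fit_linear_svm(residual, y)
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no discriminative direction left
            break
        u = [wj / norm for wj in w]
        axes.append(u)
        deflated = []
        for x in residual:
            coeff = dot(x, u)
            deflated.append([xj - coeff * uj for xj, uj in zip(x, u)])
        residual = deflated
    return axes


# Hypothetical toy samples with three features; labels are +1 / -1.
X = [(2.0, 1.0, 0.3), (3.0, -0.5, -0.2), (2.5, 2.0, 0.1), (4.0, 0.5, -0.4),
     (-2.0, -1.0, 0.2), (-3.0, 0.5, -0.3), (-2.5, -1.5, 0.4), (-1.5, -2.0, -0.1)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

axes = orthogonal_svm_axes(X, y, n_axes=2)
# Two-dimensional coordinates that could be passed to a scatter plot.
coords = [tuple(dot(x, u) for u in axes) for x in X]
for point, label in zip(coords, y):
    print(label, point)
```

Because each refit sees only data with the earlier directions removed, the resulting axes are orthogonal by construction. Projecting each sample onto them gives low-dimensional coordinates for plotting, and the first coordinate can also serve as a simple class score.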

Contact: komura@hal.rcast.u-tokyo.ac.jp.

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Cluster Analysis
  • Computing Methodologies
  • Diagnosis, Computer-Assisted / methods
  • Gene Expression Profiling / methods*
  • Humans
  • Leukemia / diagnosis
  • Leukemia / genetics
  • Leukemia / metabolism
  • Lung Neoplasms / diagnosis
  • Lung Neoplasms / genetics
  • Lung Neoplasms / metabolism
  • Neoplasm Proteins / metabolism
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated / methods*
  • User-Computer Interface*

Substances

  • Neoplasm Proteins