[Research on pattern classification methods using gene expression data]

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2005 Jun;22(3):505-9.
[Article in Chinese]

Abstract

One application of cDNA microarrays is to identify the class and subclass of diseases such as cancers by applying statistical pattern classification methods to gene expression data. In this paper, we use a 2000-gene expression dataset provided by Affymetrix, comprising 40 samples of intestinal cancer tissue and 22 samples of normal tissue. We compare the performance of four pattern classification methods combined with different feature selection methods. The four classifiers are the Fisher linear discriminant, the logit nonlinear discriminant, the minimum distance classifier, and the K-nearest neighbor classifier. The results show, first, that all four classifiers perform better with feature selection by t-test or classification tree than with random feature selection; second, that the K-nearest neighbor classifier performs best; third, that both the minimum distance classifier and the K-nearest neighbor classifier generalize better; and fourth, that the four classifiers are relatively insensitive to the composition of the samples.
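The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data. It assumes synthetic Gaussian expression values (40 "tumor" and 22 "normal" samples, 200 genes rather than 2000, of which the first 10 are shifted in the tumor class), a Welch t statistic for ranking genes, Euclidean distance, k = 3, and leave-one-out evaluation. All function names and parameter values are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import Counter

def welch_t(a, b):
    """Welch two-sample t statistic for two lists of expression values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

def select_by_t(X, y, n_keep):
    """Indices of the n_keep genes with the largest |t| between classes."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 1]
        b = [row[g] for row, lab in zip(X, y) if lab == 0]
        scores.append((abs(welch_t(a, b)), g))
    return [g for _, g in sorted(scores, reverse=True)[:n_keep]]

def knn_predict(X_train, y_train, x, genes, k):
    """Majority vote among the k nearest training samples (Euclidean)."""
    target = [x[g] for g in genes]
    dists = sorted(
        (math.dist([row[g] for g in genes], target), lab)
        for row, lab in zip(X_train, y_train)
    )
    return Counter(lab for _, lab in dists[:k]).most_common(1)[0][0]

def loo_accuracy(X, y, n_keep, k, selector):
    """Leave-one-out accuracy; genes are re-selected inside every fold
    so the held-out sample never influences feature selection."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = selector(X_tr, y_tr, n_keep)
        correct += knn_predict(X_tr, y_tr, X[i], genes, k) == y[i]
    return correct / len(X)

def make_data(rng, n_tumor=40, n_normal=22, n_genes=200,
              n_informative=10, shift=2.0):
    """Synthetic stand-in for the expression matrix (assumed, not real data)."""
    X, y = [], []
    for lab, n in ((1, n_tumor), (0, n_normal)):
        for _ in range(n):
            row = [rng.gauss(0, 1) for _ in range(n_genes)]
            if lab == 1:
                for g in range(n_informative):
                    row[g] += shift  # informative genes differ in tumor samples
            X.append(row)
            y.append(lab)
    return X, y

rng = random.Random(0)
X, y = make_data(rng)

# Baseline: pick genes at random, ignoring the class labels.
def select_random(X_tr, y_tr, n_keep):
    return rng.sample(range(len(X_tr[0])), n_keep)

acc_t = loo_accuracy(X, y, n_keep=10, k=3, selector=select_by_t)
acc_random = loo_accuracy(X, y, n_keep=10, k=3, selector=select_random)
print(f"t-test selection: {acc_t:.2f}, random selection: {acc_random:.2f}")
```

Re-selecting genes inside each fold is a deliberate choice in this sketch: ranking genes on all 62 samples before cross-validation would leak label information into the held-out sample and inflate the accuracy estimate. The other classifiers named in the abstract could be compared by swapping out `knn_predict`.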

Publication types

  • Comparative Study

MeSH terms

  • Gene Expression
  • Humans
  • Intestinal Neoplasms / classification*
  • Intestinal Neoplasms / genetics
  • Oligonucleotide Array Sequence Analysis*
  • Pattern Recognition, Automated / methods*