Minimum number of genes for microarray feature selection

Elena Baralis; Giulia Bruno; Alessandro Fiori

doi:10.1109/IEMBS.2008.4650506

Annu Int Conf IEEE Eng Med Biol Soc. 2008:2008:5692-5. doi: 10.1109/IEMBS.2008.4650506.

Authors

Elena Baralis¹, Giulia Bruno, Alessandro Fiori

Affiliation

¹ Politecnico di Torino, Italy.

PMID: 19164009

Abstract

A fundamental problem in microarray analysis is identifying relevant genes from large amounts of expression data. Feature selection aims to identify a subset of features for building robust learning models. However, finding the optimal number of features is challenging, because it is a trade-off between information loss when pruning is too aggressive and increased noise when pruning is too weak. This paper presents a novel representation of genes as bit strings and a method that automatically selects the minimum number of genes needed to reach good classification accuracy on the training set. Our method first eliminates redundant features, which add no further information for classification, and then applies a set covering algorithm. Preliminary experimental results on public datasets support the intuition behind the proposed method, which achieves high classification accuracy.
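
The two steps named in the abstract (redundancy elimination, then set covering) can be illustrated with a minimal sketch. The abstract does not say how the bit strings are built or which set covering algorithm is used, so everything below is an assumption made for illustration: each gene is taken to be a bitmask with one bit per training sample, a set bit meaning the gene alone classifies that sample correctly; a gene is treated as redundant when its covered samples are a subset of another gene's; and the cover is found with the standard greedy heuristic, which approximates rather than guarantees the minimum. The names `remove_redundant`, `greedy_set_cover`, and the toy masks are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def remove_redundant(masks: Dict[str, int]) -> Dict[str, int]:
    """Drop genes whose covered samples are a subset of an already kept gene's.

    Assumed notion of redundancy; the paper's exact criterion is not given
    in the abstract.
    """
    kept: Dict[str, int] = {}
    # Visit genes with larger coverage first so supersets are kept before subsets.
    for gene, mask in sorted(masks.items(), key=lambda kv: (-popcount(kv[1]), kv[0])):
        if mask == 0:
            continue  # covers no sample, so it carries no information here
        if any(mask & other == mask for other in kept.values()):
            continue  # every sample it covers is already covered by a kept gene
        kept[gene] = mask
    return kept


def greedy_set_cover(masks: Dict[str, int], n_samples: int) -> List[str]:
    """Greedy heuristic: repeatedly pick the gene covering the most uncovered samples."""
    uncovered = (1 << n_samples) - 1  # all training samples start uncovered
    selected: List[str] = []
    while uncovered:
        best = max(sorted(masks), key=lambda g: popcount(masks[g] & uncovered))
        gain = masks[best] & uncovered
        if gain == 0:
            break  # the remaining samples cannot be covered by any gene
        selected.append(best)
        uncovered &= ~gain
    return selected


# Toy data: 5 training samples, 5 genes (hypothetical bitmasks).
toy_masks = {
    "g1": 0b00111,
    "g2": 0b00011,  # subset of g1
    "g3": 0b11000,
    "g4": 0b10100,
    "g5": 0b00100,  # subset of g1
}
kept = remove_redundant(toy_masks)
selected = greedy_set_cover(kept, n_samples=5)
print(sorted(kept), selected)
```

On this toy input, g2 and g5 are dropped as redundant, and the greedy pass covers all five samples with g1 and g3. Python integers are used as bit strings so that subset tests and coverage updates reduce to single bitwise operations.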

Publication types

Evaluation Study

MeSH terms

Algorithms
Artificial Intelligence*
Diagnosis, Computer-Assisted / methods*
Gene Expression Profiling / methods*
Humans
Oligonucleotide Array Sequence Analysis / methods*
Pattern Recognition, Automated / methods*
Reproducibility of Results
Sample Size
Sensitivity and Specificity
Signal Processing, Computer-Assisted