Tailored aggregation for classification

Tristan Mary-Huard; Stéphane Robin

doi:10.1109/TPAMI.2009.55

IEEE Trans Pattern Anal Mach Intell. 2009 Nov;31(11):2098-105. doi: 10.1109/TPAMI.2009.55.

Authors

Tristan Mary-Huard¹, Stéphane Robin

Affiliation

¹ UMR AgroParisTech/INRIA, Paris Cedex 05, France. maryhuar@agroparistech.fr

PMID: 19762936

Abstract

Compression and variable selection are two classical strategies for dealing with high-dimensional data sets in classification. We propose an alternative strategy, called aggregation, which consists of clustering redundant variables and then compressing the variables within each group. We develop a statistical framework to define tailored aggregation methods that can be combined with selection methods to build reliable classifiers that benefit from the information contained in redundant variables. Two algorithms are proposed, one for ordered and one for nonordered variables. Applications to the kNN and CART algorithms are presented.
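
The two steps of aggregation described above (clustering redundant variables, then compressing within each group) can be illustrated with a minimal Python sketch. It is not the algorithm of the paper: the correlation threshold, the single-linkage grouping, the averaging used for compression, and the function names pearson, cluster_variables, and aggregate are all assumptions chosen for illustration, and the data are synthetic.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_variables(columns, threshold=0.9):
    """Group variables whose absolute correlation is at least `threshold`.

    Single-linkage grouping via union-find: an assumed stand-in for the
    clustering step, not the method proposed in the paper.
    """
    p = len(columns)
    parent = list(range(p))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(p):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def aggregate(columns, groups):
    """Compress each group into one variable by averaging its members."""
    n = len(columns[0])
    return [
        [sum(columns[j][k] for j in g) / len(g) for k in range(n)]
        for g in groups
    ]


# Synthetic data (an assumption): five variables built from two independent
# signals, so variables 0-2 are redundant with each other, as are 3-4.
random.seed(0)
n_obs = 200
a = [random.gauss(0, 1) for _ in range(n_obs)]
b = [random.gauss(0, 1) for _ in range(n_obs)]
columns = [[v + random.gauss(0, 0.1) for v in a] for _ in range(3)]
columns += [[v + random.gauss(0, 0.1) for v in b] for _ in range(2)]

groups = cluster_variables(columns, threshold=0.9)
reduced = aggregate(columns, groups)
print(groups, len(reduced))
```

The aggregated variables in reduced could then be passed to any classifier, such as kNN or CART, in place of the original variables. Averaging is only one possible compression; the paper's framework concerns how to tailor this choice to the classifier.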

MeSH terms

Algorithms*
Artificial Intelligence*
Cluster Analysis*
Computer Simulation
Decision Support Techniques*
Models, Theoretical*
Pattern Recognition, Automated / methods*