Generalization Performance of Pure Accuracy and its Application in Selective Ensemble Learning

IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):1798-1816. doi: 10.1109/TPAMI.2022.3171436. Epub 2023 Jan 6.

Abstract

The pure accuracy measure is used to eliminate random consistency from the accuracy measure. Its bias toward both the majority and the minority classes is lower than that of the accuracy measure. In this paper, we demonstrate that, compared with the accuracy measure and the F-measure, the pure accuracy measure is insensitive to class distribution and is discriminative for good classifiers. These advantages make the pure accuracy measure suitable for traditional classification. We then focus on two points: deriving a tighter generalization bound for the pure-accuracy-based learning paradigm and designing a learning algorithm based on the pure accuracy measure. Specifically, using the self-bounding property, we establish an algorithm-independent generalization bound for the pure accuracy measure that is tighter than the existing bound of order O(1/√N), where N is the number of instances. The proposed bound requires no smoothness or convexity assumptions on the hypothesis functions. In addition, we design a learning algorithm that optimizes the pure accuracy measure and apply it in the selective ensemble learning setting. Experiments on sixteen benchmark data sets and four image data sets demonstrate that the proposed method statistically outperforms eight representative baseline algorithms.
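For intuition about how random consistency is removed from accuracy, the sketch below computes a chance-corrected accuracy, assuming pure accuracy takes the familiar form (A - A_rand)/(1 - A_rand), where A_rand is the agreement expected from the marginal class frequencies (a Cohen's-kappa-style correction). The function name `pure_accuracy` and this exact definition of random consistency are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def pure_accuracy(y_true, y_pred):
    """Chance-corrected accuracy: (A - A_rand) / (1 - A_rand).

    A_rand is the accuracy expected if predictions agreed with labels
    only by chance, given the marginal class frequencies. Illustrative
    sketch; the paper's exact definition of random consistency may differ.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.union1d(y_true, y_pred)

    # Observed accuracy.
    acc = float(np.mean(y_true == y_pred))

    # Expected chance agreement from the marginal class frequencies.
    p_true = np.array([np.mean(y_true == c) for c in classes])
    p_pred = np.array([np.mean(y_pred == c) for c in classes])
    acc_rand = float(np.dot(p_true, p_pred))

    if acc_rand >= 1.0:  # degenerate case: only one class on both sides
        return 0.0
    return (acc - acc_rand) / (1.0 - acc_rand)

# Toy example with 9:1 class imbalance: always predicting the majority
# class gives accuracy 0.9 but pure accuracy 0, since that agreement is
# fully explained by random consistency.
y_true = [0] * 9 + [1]
print(pure_accuracy(y_true, [0] * 10))  # 0.0
print(pure_accuracy(y_true, y_true))    # 1.0
```

The imbalanced toy example illustrates the bias claim in the abstract: plain accuracy rewards a trivial majority-class predictor, while the chance-corrected score does not.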