Online Passive-Aggressive Active Learning for Trapezoidal Data Streams

IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):6725-6739. doi: 10.1109/TNNLS.2022.3178880. Epub 2023 Oct 5.

Abstract

The idea of combining the active query strategy and the passive-aggressive (PA) update strategy in online learning can be credited to the PA active (PAA) algorithm, which has proven to be effective in learning linear classifiers from datasets with a fixed feature space. We propose a novel family of online active learning algorithms, named PAA learning for trapezoidal data streams (PAATS) and multiclass PAATS (MPAATS) (and their variants), for binary and multiclass online classification tasks on trapezoidal data streams, where the feature space may expand over time. In the context of an ever-changing feature space, we provide a theoretical analysis of the mistake bounds for both PAATS and MPAATS. Our experiments on a wide variety of benchmark datasets have confirmed that the combination of the instance-regulated active query strategy and the PA update strategy is much more effective in learning from trapezoidal data streams. We have also compared PAATS with online learning with streaming features (OLSF), the state-of-the-art approach to learning linear classifiers from trapezoidal data streams. PAATS could achieve much better classification accuracy, especially for large-scale real-world data streams.
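
To make the combination described above concrete, the following is a minimal sketch of one learning round. It is not the PAATS algorithm from the paper. It assumes a margin-based Bernoulli query with a hypothetical smoothing parameter `delta`, the basic PA step size (hinge loss divided by the squared norm of the instance), and zero initial weights for newly arrived features. The function name `pa_active_step` and its signature are illustrative only.

```python
import random


def pa_active_step(w, x, y, delta=1.0, rng=random):
    """One round of a simplified PA-active update on an expanding feature space.

    w: current weights (list of floats)
    x: instance (list of floats); assumed to satisfy len(x) >= len(w)
    y: true label in {-1, +1}; only used when the label is queried
    Returns (new_w, prediction, queried).
    """
    # Assumption: features seen for the first time start with weight zero.
    w = w + [0.0] * (len(x) - len(w))

    margin = sum(wi * xi for wi, xi in zip(w, x))
    prediction = 1 if margin >= 0 else -1

    # Margin-based query: low-confidence instances are queried more often.
    queried = rng.random() < delta / (delta + abs(margin))

    if queried:
        loss = max(0.0, 1.0 - y * margin)  # hinge loss
        norm_sq = sum(xi * xi for xi in x)
        if loss > 0.0 and norm_sq > 0.0:
            tau = loss / norm_sq  # basic PA step size
            w = [wi + tau * y * xi for wi, xi in zip(w, x)]

    return w, prediction, queried
```

A caller would feed instances in arrival order and carry the returned weight vector forward, for example `w, pred, asked = pa_active_step(w, x, y)`. Larger values of `delta` request more labels; how the paper's instance-regulated strategy sets the query probability is not specified in this abstract.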