No Adversaries to Zero-Shot Learning: Distilling an Ensemble of Gaussian Feature Generators

IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12167-12178. doi: 10.1109/TPAMI.2023.3282971. Epub 2023 Sep 5.

Abstract

In zero-shot learning (ZSL), the task of recognizing unseen categories for which no training data are available, state-of-the-art methods generate visual features from semantic auxiliary information (e.g., attributes). In this work, we propose a simpler yet better-scoring alternative for the same task. We observe that, if the first- and second-order statistics of the classes to be recognized were known, sampling from Gaussian distributions would synthesize visual features that are almost indistinguishable from the real ones for classification purposes. We propose a novel mathematical framework to estimate first- and second-order statistics, even for unseen classes: our framework builds upon prior compatibility functions for ZSL and does not require additional training. Endowed with such statistics, we use a pool of class-specific Gaussian distributions to solve the feature generation stage through sampling. We exploit an ensemble mechanism to aggregate a pool of softmax classifiers, each trained in a one-seen-class-out fashion, to better balance performance over seen and unseen classes. Neural distillation is finally applied to fuse the ensemble into a single architecture that performs inference in a single forward pass. Our method, termed Distilled Ensemble of Gaussian Generators, compares favorably with state-of-the-art works.
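The feature generation stage described above can be illustrated with a minimal sketch, assuming the per-class mean vectors and covariance matrices are already available. The function name `sample_class_features`, its arguments, and the use of NumPy are assumptions made for illustration only; the paper's estimator of unseen-class statistics, the classifier ensemble, and the distillation step are not reproduced here.

```python
import numpy as np


def sample_class_features(means, covs, n_per_class, seed=0):
    """Draw synthetic features from one Gaussian per class.

    means: dict mapping class label -> (d,) mean vector (assumed given)
    covs:  dict mapping class label -> (d, d) covariance matrix (assumed given)
    Returns a feature matrix of shape (n_classes * n_per_class, d)
    and the matching label vector.
    """
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for label in sorted(means):
        # Sample n_per_class feature vectors from N(mean, cov) for this class.
        x = rng.multivariate_normal(means[label], covs[label], size=n_per_class)
        feats.append(x)
        labels.append(np.full(n_per_class, label))
    return np.vstack(feats), np.concatenate(labels)


# Toy usage with made-up statistics for two classes in a 3-D feature space.
toy_means = {0: np.zeros(3), 1: np.full(3, 5.0)}
toy_covs = {0: 0.01 * np.eye(3), 1: 0.01 * np.eye(3)}
X, y = sample_class_features(toy_means, toy_covs, n_per_class=200)
```

The resulting `X` and `y` could then be used to train any softmax classifier in place of real visual features.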