Generative Mixup Networks for Zero-Shot Learning

IEEE Trans Neural Netw Learn Syst. 2022 Jan 28:PP. doi: 10.1109/TNNLS.2022.3142181. Online ahead of print.

Abstract

Zero-shot learning addresses the lack of unseen-class data by transferring knowledge from seen classes through a joint semantic space. However, the distributions of samples from seen and unseen classes are usually imbalanced, so many zero-shot learning methods fail to obtain satisfactory results on the generalized zero-shot learning task, where both seen and unseen classes appear at test time. Moreover, the irregular structure of some classes may lead to an inappropriate mapping from the visual feature space to the semantic attribute space. To mitigate these problems, this article proposes a novel generative mixup network with semantic graph alignment. Specifically, the model first synthesizes samples conditioned on class-level semantic information as prototypes, recovering the class-wise feature distribution from the given semantic description. Second, it exploits a mixup mechanism to augment the training samples and improve the model's generalization ability. Third, a triplet gradient matching loss is developed to make the class-invariant representation more continuous in the latent space and to help the discriminator distinguish real from synthesized samples. Finally, a similarity graph constructed from the semantic attributes captures the intrinsic correlations among classes and guides the feature generation process. Extensive experiments on several zero-shot learning benchmarks across different tasks show that the proposed model outperforms state-of-the-art generalized zero-shot learning methods.
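Two of the building blocks mentioned in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the function names, the Beta(α, α) mixing coefficient, and the top-k cosine-similarity graph construction are assumptions based on the standard formulations of mixup augmentation and attribute-similarity graphs.

```python
import numpy as np

def mixup(x1, x2, y1, y2, alpha=0.2, rng=None):
    # Standard mixup: draw lambda ~ Beta(alpha, alpha) and take a convex
    # combination of two samples and of their (one-hot) label vectors.
    rng = rng if rng is not None else np.random.default_rng()
    lam = float(rng.beta(alpha, alpha))
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam

def semantic_graph(attrs, k=2):
    # Cosine-similarity graph over class-level attribute vectors
    # (rows of `attrs`), keeping only each class's k most similar
    # neighbours; self-loops are excluded via the -inf diagonal.
    normed = attrs / np.linalg.norm(attrs, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)
    adj = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        nbrs = np.argsort(sim[i])[-k:]  # indices of the k most similar classes
        adj[i, nbrs] = sim[i, nbrs]
    return adj
```

In a generative zero-shot pipeline of the kind described above, mixed samples would augment the generator's training set, while the sparse adjacency matrix would align the generated features with the attribute-space neighbourhood structure.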