Leveraging Balanced Semantic Embedding for Generative Zero-Shot Learning

IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):9575-9582. doi: 10.1109/TNNLS.2022.3208525. Epub 2023 Oct 27.

Abstract

Generative (generalized) zero-shot learning [(G)ZSL] models aim to synthesize unseen-class features using only seen-class feature-attribute pairs as training data. However, the generated fake unseen features tend to be dominated by the seen-class features and thus classified as seen classes, which can lead to inferior performance under zero-shot learning (ZSL) and unbalanced results under generalized ZSL (GZSL). To address this challenge, we tailor a novel balanced semantic embedding generative network (BSeGN), which incorporates balanced semantic embedding learning into generative learning scenarios in the pursuit of unbiased GZSL. Specifically, we first design a feature-to-semantic embedding module (FEM) to distinguish real seen and fake unseen features collaboratively with the generator in an online manner. We introduce bidirectional contrastive and balance losses for FEM learning, which guarantee a balanced prediction for interdomain features. In turn, the updated FEM can boost the learning of the generator. Next, we propose a multilevel feature integration module (mFIM) from the cycle-consistency branch of BSeGN, which can mitigate the domain bias through feature enhancement. To the best of our knowledge, this is the first work to explore embedding and generative learning jointly within the field of ZSL. Extensive evaluations on four benchmarks demonstrate the superiority of BSeGN over its state-of-the-art counterparts.
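To make the overall idea concrete, below is a minimal, illustrative PyTorch sketch of how a conditional feature generator and a feature-to-semantic embedding module (FEM) could be trained jointly with a contrastive term and a "balance" term. All module names, layer sizes, and loss forms here are assumptions made for illustration only; they are not the authors' released implementation or exact formulations.

```python
# Illustrative sketch (assumed architecture, not the paper's exact code):
# a conditional generator synthesizes unseen-class features from noise + attributes,
# and an FEM maps visual features back to the attribute (semantic) space.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, ATTR_DIM, NOISE_DIM = 2048, 85, 85  # assumed: ResNet-style features, AwA2-style attributes


class Generator(nn.Module):
    """Synthesizes a visual feature from noise conditioned on a class attribute vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + ATTR_DIM, 4096), nn.LeakyReLU(0.2),
            nn.Linear(4096, FEAT_DIM), nn.ReLU())

    def forward(self, z, attr):
        return self.net(torch.cat([z, attr], dim=1))


class FEM(nn.Module):
    """Feature-to-semantic embedding: projects a visual feature into attribute space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, ATTR_DIM))

    def forward(self, x):
        return self.net(x)


def contrastive_loss(emb, class_attrs, labels, temperature=0.1):
    """Standard InfoNCE-style term: pull each embedded feature toward its own
    class attribute and push it away from the other class attributes."""
    logits = F.normalize(emb, dim=1) @ F.normalize(class_attrs, dim=1).t() / temperature
    return F.cross_entropy(logits, labels)


def balance_loss(emb, seen_attrs, unseen_attrs):
    """Illustrative 'balance' term (an assumption): penalize generated features whose
    best seen-class score exceeds their best unseen-class score, so synthesized
    unseen features are not systematically pulled toward seen classes."""
    seen_score = (F.normalize(emb, dim=1) @ F.normalize(seen_attrs, dim=1).t()).max(dim=1).values
    unseen_score = (F.normalize(emb, dim=1) @ F.normalize(unseen_attrs, dim=1).t()).max(dim=1).values
    return F.relu(seen_score - unseen_score).mean()


if __name__ == "__main__":
    # Random stand-in data; a real pipeline would use extracted CNN features
    # and dataset-provided class attribute vectors.
    gen, fem = Generator(), FEM()
    seen_attrs = torch.randn(40, ATTR_DIM)     # per-class attributes of seen classes
    unseen_attrs = torch.randn(10, ATTR_DIM)   # per-class attributes of unseen classes
    labels = torch.randint(0, 10, (32,))       # sampled unseen classes to synthesize
    z = torch.randn(32, NOISE_DIM)

    fake_unseen = gen(z, unseen_attrs[labels])  # synthesized unseen-class features
    emb = fem(fake_unseen)                      # embed back into semantic space
    loss = contrastive_loss(emb, unseen_attrs, labels) + balance_loss(emb, seen_attrs, unseen_attrs)
    loss.backward()
    print(float(loss))
```

In this sketch the balance term acts only on the scores of the embedded features against seen versus unseen class prototypes; the FEM and generator would then be alternated or jointly optimized so that, as the abstract describes, the updated FEM in turn boosts the generator.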