Graph embedding based multi-label Zero-shot Learning

Neural Netw. 2023 Oct:167:129-140. doi: 10.1016/j.neunet.2023.08.023. Epub 2023 Aug 19.

Abstract

Multi-label Zero-shot Learning (ZSL) is more reasonable and realistic than standard single-label ZSL because several objects can co-exist in a natural image in real scenarios. Intra-class feature entanglement is a significant factor influencing the alignment of visual and semantic features, resulting in the model's inability to recognize unseen samples comprehensively and completely. We observe that existing multi-label ZSL methods place a greater emphasis on attention-based refinement and decoupling of visual features, while ignoring the relationship between label semantics. Relying on label correlations to solve multi-label ZSL tasks has not been deeply studied. In this paper, we make full use of the co-occurrence relationship between category labels and build a directed weighted semantic graph based on statistics and prior knowledge, in which node features represent category semantics and weighted edges represent conditional probabilities of label co-occurrence. To guide the targeted extraction of visual features, node features and edge set weights are simultaneously updated and refined, and embedded into the visual feature extraction network from a global and local perspective. The proposed method's effectiveness was demonstrated by simulation results on two challenging multi-label ZSL benchmarks: NUS-WIDE and Open Images. In comparison to state-of-the-art models, our model achieves an absolute gain of 2.4% mAP on NUS-WIDE and 2.1% mAP on Open Images respectively.

Keywords: Feature embedding; Knowledge graph; Multi-label classification; Zero-shot Learning.

MeSH terms

  • Benchmarking*
  • Computer Simulation
  • Knowledge
  • Learning*
  • Probability