Multi-view graph representation with similarity diffusion for general zero-shot learning

Beibei Yu; Cheng Xie; Peng Tang; Haoran Duan

doi:10.1016/j.neunet.2023.06.045

Neural Netw. 2023 Sep:166:38-50. doi: 10.1016/j.neunet.2023.06.045. Epub 2023 Jul 7.

Authors

Beibei Yu¹, Cheng Xie², Peng Tang³, Haoran Duan⁴

Affiliations

¹ School of Software, Yunnan University, Kunming, 650500, China. Electronic address: yubeibei@mail.ynu.edu.cn.
² School of Software, Yunnan University, Kunming, 650500, China. Electronic address: xiecheng@ynu.edu.cn.
³ School of Software, Yunnan University, Kunming, 650500, China. Electronic address: tangpeng@mail.ynu.edu.cn.
⁴ School of Software, Yunnan University, Kunming, 650500, China. Electronic address: duanhaoran@ynu.edu.cn.

PMID: 37480768
DOI: 10.1016/j.neunet.2023.06.045

Abstract

Zero-shot learning (ZSL) aims to predict unseen classes without using samples of those classes during model training. ZSL has been widely used in knowledge-based models and applications to predict a variety of targets, including categories, subjects, and anomalies, across different domains. Nonetheless, most existing ZSL methods require pre-defined semantics or attributes tied to a particular data environment. These methods are therefore difficult to apply to general data environments, such as ImageNet and other real-world datasets and applications. Recent research has tried to use open knowledge to adapt ZSL methods to open data environments. However, the performance of these methods is relatively low, with accuracy normally below 10%, because the semantics that can be drawn from open knowledge are inadequate. Moreover, the latest methods suffer from a significant "semantic gap" between the generated features of unseen classes and the real features of seen classes. To address these problems, this paper proposes a multi-view graph representation with similarity diffusion model that applies ZSL to general data environments. The model uses a multi-view graph to enrich the semantics and proposes a diffusion method to augment the graph representation. In addition, a feature diffusion method is proposed to augment the multi-view graph representation and bridge the semantic gap, enabling zero-shot prediction. Experiments in general data environments and on benchmark datasets show that the proposed method achieves new state-of-the-art results in general zero-shot learning. Furthermore, seven ablation studies analyze in detail how the settings and individual modules of the proposed method affect its performance, and demonstrate the effectiveness of each module.

Keywords: Feature diffusion; Graph representation; Knowledge graph; Knowledge-based model; Zero-shot learning.

MeSH terms

Benchmarking*
Diffusion
Humans
Knowledge
Knowledge Bases
Learning*