Explanatory subgraph attacks against Graph Neural Networks

Neural Netw. 2024 Apr:172:106097. doi: 10.1016/j.neunet.2024.106097. Epub 2024 Jan 23.

Abstract

Graph Neural Networks (GNNs) are often viewed as black boxes due to their lack of transparency, which hinders their application in critical fields. Many explanation methods have been proposed to address the interpretability issue of GNNs. These explanation methods reveal explanatory information about graphs from different perspectives. However, the explanatory information may also expose GNN models to attack risks. In this work, we explore this problem from the explanatory subgraph perspective. To this end, we utilize a powerful GNN explanation method called SubgraphX and deploy it locally to obtain explanatory subgraphs from given graphs. We then propose methods for conducting evasion attacks and backdoor attacks based on this local explainer. In evasion attacks, the attacker obtains the explanatory subgraphs of test graphs from the local explainer and replaces them with explanatory subgraphs taken from graphs of other labels, causing the target model to misclassify the test graphs. In backdoor attacks, the attacker employs the local explainer to select an explanatory trigger and to identify suitable injection locations. We validate the effectiveness of the proposed attacks on state-of-the-art GNN models across multiple datasets. The results also demonstrate that our proposed backdoor attack is more efficient, adaptable, and stealthy than previous backdoor attacks.
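
The two attack primitives summarized above can be sketched roughly as follows, using networkx for the graph surgery. Here `explainer` is a hypothetical stand-in for a locally deployed explainer such as SubgraphX (a callable that takes a graph and returns the node set of its explanatory subgraph), node ids are assumed to be integers, and the splicing and anchoring heuristics are naive simplifications rather than the paper's exact procedure.

```python
import networkx as nx

def evasion_attack(test_graph, donor_subgraph, explainer):
    """Cut out the test graph's explanatory subgraph and splice in an
    explanatory subgraph taken from a graph of another label."""
    attacked = test_graph.copy()
    attacked.remove_nodes_from(explainer(attacked))
    # Relabel donor nodes to avoid id clashes, then merge the two graphs.
    offset = max(attacked.nodes, default=-1) + 1
    donor = nx.relabel_nodes(
        donor_subgraph,
        {n: i + offset for i, n in enumerate(donor_subgraph.nodes)})
    attacked = nx.compose(attacked, donor)
    # Naive reconnection: a single edge between the remainder and the donor.
    remainder = set(attacked.nodes) - set(donor.nodes)
    if remainder and donor.number_of_nodes():
        attacked.add_edge(next(iter(remainder)), next(iter(donor.nodes)))
    return attacked

def backdoor_poison(train_graph, trigger, explainer):
    """Attach an explanatory trigger subgraph at an injection location
    guided by the explainer (simplified here to one node drawn from the
    explanatory subgraph)."""
    poisoned = train_graph.copy()
    anchor = next(iter(explainer(poisoned)))
    offset = max(poisoned.nodes) + 1
    trig = nx.relabel_nodes(
        trigger, {n: i + offset for i, n in enumerate(trigger.nodes)})
    poisoned = nx.compose(poisoned, trig)
    poisoned.add_edge(anchor, next(iter(trig.nodes)))
    return poisoned

# Toy usage: a degree-based "explainer" on a path graph, with a triangle
# as the backdoor trigger.
toy_explainer = lambda g: {max(g.nodes, key=g.degree)}
poisoned = backdoor_poison(nx.path_graph(6), nx.complete_graph(3), toy_explainer)
print(poisoned.number_of_edges())  # 5 original + 3 trigger + 1 anchor = 9
```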

Keywords: Adversarial attacks; Backdoor attacks; Explainability; Graph Neural Networks.

MeSH terms

  • Neural Networks, Computer*