Remix: Towards the transferability of adversarial examples

Neural Netw. 2023 Jun:163:367-378. doi: 10.1016/j.neunet.2023.04.012. Epub 2023 Apr 18.

Abstract

Deep neural networks (DNNs) are susceptible to adversarial examples, which are crafted by deliberately adding human-imperceptible perturbations to original images. To explore the vulnerability of DNN models, transfer-based black-box attacks have attracted increasing attention from researchers owing to their high practicality. Transfer-based approaches can easily attack models in the black-box setting using the resulting adversarial examples, but their success rates remain unsatisfactory. To boost adversarial transferability, we propose Remix, a method with multiple input transformations that achieves multiple forms of data augmentation by utilizing gradients from previous iterations and images from other categories within the same iteration. Extensive experiments on the NeurIPS 2017 adversarial dataset and the ILSVRC 2012 validation dataset demonstrate that the proposed approach drastically enhances adversarial transferability while maintaining similar white-box success rates on both undefended and defended models. Furthermore, extended experiments based on LPIPS show that our method maintains a perceptual distance similar to that of other baselines.
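The abstract describes an iterative attack that reuses gradients from previous iterations and mixes in images from other categories. A minimal NumPy sketch of that general recipe is below, using a toy linear classifier; the momentum accumulation and Admix-style input mixing are assumptions about the mechanism, not the exact Remix algorithm, and all names (`remix_style_attack`, `eta`, `mu`) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def grad_wrt_input(W, x, label):
    # Analytic gradient of cross-entropy loss w.r.t. the input
    # for a linear model with logits W @ x.
    p = softmax(W @ x)
    p[label] -= 1.0
    return W.T @ p

def remix_style_attack(W, x, label, other_images, eps=0.1, steps=10, mu=1.0, eta=0.2):
    """Hypothetical sketch: a momentum iterative attack that mixes
    images from other categories into the input before each gradient
    step. The exact Remix transformations differ; this only
    illustrates the general transfer-attack structure."""
    alpha = eps / steps
    x_adv = x.copy()
    g = np.zeros_like(x)  # accumulated (momentum) gradient across iterations
    for _ in range(steps):
        # Average gradients over copies mixed with other-category images.
        grads = [grad_wrt_input(W, x_adv + eta * xo, label) for xo in other_images]
        grad = np.mean(grads, axis=0)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # momentum update
        x_adv = x_adv + alpha * np.sign(g)                # sign-gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)          # stay in the eps-ball
    return x_adv

# Toy demo: 3-class linear model on 8-dim inputs (all values synthetic).
W = rng.normal(size=(3, 8))
x = rng.normal(size=8)
label = int(np.argmax(softmax(W @ x)))
others = [rng.normal(size=8) for _ in range(3)]
x_adv = remix_style_attack(W, x, label, others, eps=0.1, steps=10)
```

The eps-ball clipping keeps the perturbation imperceptibly small, matching the abstract's constraint of human-imperceptible perturbations, while the momentum term stabilizes the update direction across iterations.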

Keywords: Adversarial transferability; Black-box attack; Deep neural networks.

MeSH terms

  • Humans
  • Neural Networks, Computer*