Remix: Towards the transferability of adversarial examples

Neural Netw. 2023 Jun:163:367-378. doi: 10.1016/j.neunet.2023.04.012. Epub 2023 Apr 18.

Abstract

Deep neural networks (DNNs) are susceptible to adversarial examples, which are crafted by deliberately adding human-imperceptible perturbations to original images. To explore the vulnerability of DNN models, transfer-based black-box attacks have attracted increasing attention from researchers owing to their high practicality. Transfer-based approaches can easily attack models in the black-box setting using the resulting adversarial examples, but their success rates remain unsatisfactory. To boost adversarial transferability, we propose Remix, a method with multiple input transformations that achieves multiple forms of data augmentation by utilizing gradients from previous iterations and images from other categories within the same iteration. Extensive experiments on the NeurIPS 2017 adversarial dataset and the ILSVRC 2012 validation dataset demonstrate that the proposed approach drastically enhances adversarial transferability while maintaining similar white-box success rates on both undefended and defended models. Furthermore, extended experiments based on LPIPS show that our method maintains a perceptual distance similar to that of other baselines.
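The abstract describes an iterative attack that reuses gradients from previous iterations and mixes in images from other categories. A minimal NumPy sketch of that general recipe is below, using a toy linear classifier; the momentum accumulation and Admix-style input mixing are assumptions about the mechanism, not the exact Remix algorithm, and all names (`remix_style_attack`, `eta`, `mu`) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def grad_wrt_input(W, x, label):
    # Analytic gradient of cross-entropy loss w.r.t. the input
    # for a linear model with logits W @ x.
    p = softmax(W @ x)
    p[label] -= 1.0
    return W.T @ p

def remix_style_attack(W, x, label, other_images, eps=0.1, steps=10, mu=1.0, eta=0.2):
    """Hypothetical sketch: a momentum iterative attack that mixes
    images from other categories into the input before each gradient
    step. The exact Remix transformations differ; this only
    illustrates the general transfer-attack structure."""
    alpha = eps / steps
    x_adv = x.copy()
    g = np.zeros_like(x)  # accumulated (momentum) gradient across iterations
    for _ in range(steps):
        # Average gradients over copies mixed with other-category images.
        grads = [grad_wrt_input(W, x_adv + eta * xo, label) for xo in other_images]
        grad = np.mean(grads, axis=0)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # momentum update
        x_adv = x_adv + alpha * np.sign(g)                # sign-gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)          # stay in the eps-ball
    return x_adv

# Toy demo: 3-class linear model on 8-dim inputs (all values synthetic).
W = rng.normal(size=(3, 8))
x = rng.normal(size=8)
label = int(np.argmax(softmax(W @ x)))
others = [rng.normal(size=8) for _ in range(3)]
x_adv = remix_style_attack(W, x, label, others, eps=0.1, steps=10)
```

The eps-ball clipping keeps the perturbation imperceptibly small, matching the abstract's constraint of human-imperceptible perturbations, while the momentum term stabilizes the update direction across iterations.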

Keywords: Adversarial transferability; Black-box attack; Deep neural networks.

MeSH terms

  • Humans
  • Neural Networks, Computer*