A Universal Detection Method for Adversarial Examples and Fake Images

Sensors (Basel). 2022 Apr 30;22(9):3445. doi: 10.3390/s22093445.

Abstract

Deep-learning technologies have shown impressive performance on many tasks in recent years. However, there are multiple serious security risks when using deep-learning technologies. For examples, state-of-the-art deep-learning technologies are vulnerable to adversarial examples that make the model's predictions wrong due to some specific subtle perturbation, and these technologies can be abused for the tampering with and forgery of multimedia, i.e., deep forgery. In this paper, we propose a universal detection framework for adversarial examples and fake images. We observe some differences in the distribution of model outputs for normal and adversarial examples (fake images) and train the detector to learn the differences. We perform extensive experiments on the CIFAR10 and CIFAR100 datasets. Experimental results show that the proposed framework has good feasibility and effectiveness in detecting adversarial examples or fake images. Moreover, the proposed framework has good generalizability for the different datasets and model structures.

Keywords: adversarial example; deep forgery; detection.

MeSH terms

  • Deep Learning*
  • Multimedia
  • Neural Networks, Computer*