Defense against adversarial attacks based on color space transformation

Neural Netw. 2024 May;173:106176. doi: 10.1016/j.neunet.2024.106176. Epub 2024 Feb 14.

Abstract

Deep learning algorithms have achieved state-of-the-art performance on a variety of important tasks. However, recent studies have shown that a carefully crafted perturbation can cause a network to misclassify its input, a phenomenon known as an adversarial attack. Current research suggests that adversarial examples cannot be eliminated completely, so it is always possible to construct an attack that is effective against a given defense model. We render existing adversarial examples invalid by altering the classification boundaries. At the same time, for adversarial examples that remain effective against the defense model, the required perturbations are amplified until they can be distinguished by the human eye. This paper proposes a method that implements these ideas through color space transformation. Experiments on CIFAR-10, CIFAR-100, and Mini-ImageNet demonstrate the effectiveness and versatility of our defense. To the best of our knowledge, this is the first defense model based on the amplification of adversarial perturbations.
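
The abstract does not describe the specific color space transform used by the authors. As a rough illustration of the general idea only, the sketch below assumes a plain NumPy RGB-to-HSV conversion and a hypothetical `model.predict` interface: inputs are remapped to a different color space before classification, so perturbations crafted in RGB space against the undefended model no longer align with the defended model's decision boundaries.

```python
import numpy as np

def rgb_to_hsv(images: np.ndarray) -> np.ndarray:
    """Convert a batch of RGB images (values in [0, 1], shape [N, H, W, 3]) to HSV."""
    r, g, b = images[..., 0], images[..., 1], images[..., 2]
    maxc = images.max(axis=-1)
    minc = images.min(axis=-1)
    v = maxc                                   # value channel
    delta = maxc - minc
    s = np.where(maxc > 0, delta / np.maximum(maxc, 1e-12), 0.0)  # saturation
    # Hue is piecewise, depending on which channel holds the maximum.
    safe_delta = np.maximum(delta, 1e-12)
    h = np.zeros_like(maxc)
    h = np.where(maxc == r, ((g - b) / safe_delta) % 6.0, h)
    h = np.where(maxc == g, ((b - r) / safe_delta) + 2.0, h)
    h = np.where(maxc == b, ((r - g) / safe_delta) + 4.0, h)
    h = np.where(delta == 0, 0.0, h) / 6.0     # normalize hue to [0, 1]
    return np.stack([h, s, v], axis=-1)

def classify_with_color_defense(model, images: np.ndarray) -> np.ndarray:
    """Hypothetical defended inference: transform inputs to HSV and classify
    with a model trained on HSV inputs. This is an illustrative placeholder,
    not the paper's actual pipeline."""
    return model.predict(rgb_to_hsv(images))
```

The design intuition is that retraining or evaluating the classifier in a transformed color space shifts its decision boundaries, which is the mechanism the abstract attributes to the proposed defense; the particular HSV mapping shown here is only an assumed stand-in.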

Keywords: Adversarial attack; Adversarial defense; Deep learning; Robustness.

MeSH terms

  • Algorithms*
  • Humans
  • Knowledge*