Machine learning through cryptographic glasses: combating adversarial attacks by key-based diversified aggregation

EURASIP J Inf Secur. 2020;2020(1):10. doi: 10.1186/s13635-020-00106-x. Epub 2020 Jun 1.

Abstract

In recent years, classification techniques based on deep neural networks (DNN) were widely used in many fields such as computer vision, natural language processing, and self-driving cars. However, the vulnerability of the DNN-based classification systems to adversarial attacks questions their usage in many critical applications. Therefore, the development of robust DNN-based classifiers is a critical point for the future deployment of these methods. Not less important issue is understanding of the mechanisms behind this vulnerability. Additionally, it is not completely clear how to link machine learning with cryptography to create an information advantage of the defender over the attacker. In this paper, we propose a key-based diversified aggregation (KDA) mechanism as a defense strategy in a gray- and black-box scenario. KDA assumes that the attacker (i) knows the architecture of classifier and the used defense strategy, (ii) has an access to the training data set, but (iii) does not know a secret key and does not have access to the internal states of the system. The robustness of the system is achieved by a specially designed key-based randomization. The proposed randomization prevents the gradients' back propagation and restricts the attacker to create a "bypass" system. The randomization is performed simultaneously in several channels. Each channel introduces its own randomization in a special transform domain. The sharing of a secret key between the training and test stages creates an information advantage to the defender. Finally, the aggregation of soft outputs from each channel stabilizes the results and increases the reliability of the final score. The performed experimental evaluation demonstrates a high robustness and universality of the KDA against state-of-the-art gradient-based gray-box transferability attacks and the non-gradient-based black-box attacks (The results reported in this paper have been partially presented in CVPR 2019 (Taran et al., Defending against adversarial attacks by randomized diversification, 2019) & ICIP 2019 (Taran et al., Robustification of deep net classifiers by key-based diversified aggregation with pre-filtering, 2019)).

Keywords: Adversarial examples; Black-box attacks; Defense; Diversified aggregation; Machine learning; Non-gradient/gradient-based attacks; Randomization.