Application of Improved Asynchronous Advantage Actor Critic Reinforcement Learning Model on Anomaly Detection

Kun Zhou; Wenyong Wang; Teng Hu; Kai Deng

doi:10.3390/e23030274

Entropy (Basel). 2021 Feb 25;23(3):274. doi: 10.3390/e23030274.

Authors

Kun Zhou¹ ², Wenyong Wang¹, Teng Hu¹ ², Kai Deng²

Affiliations

¹ School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
² Institute for Computer Application, China Academy of Engineering Physics, Mianyang 621900, China.

Abstract

Anomaly detection has traditionally been studied with mathematical and statistical methods and has been widely applied in many fields. Recently, reinforcement learning has achieved exceptional success in areas such as AlphaGo in the game of Go and video gaming. However, few studies have applied reinforcement learning to anomaly detection. This paper therefore proposes an adaptable asynchronous advantage actor-critic reinforcement learning model for this field. Its performance was evaluated and compared with classical machine learning models and with the generative adversarial model and its variants. The basic principles of the related models are introduced first, followed by the problem definitions, modelling processes and testing. The proposed model distinguishes sequence and image anomalies from other anomalies by applying a neural network with an attention mechanism to the former and a convolutional network to the latter. Finally, performance was evaluated and compared with classical models on public benchmark datasets (NSL-KDD, AWID, CICIDS-2017 and DoHBrw-2020). Experiments confirmed the effectiveness of the proposed model, with results indicating higher rewards and lower loss rates on the datasets during training and testing. Precision, recall and F1 score were higher than or at least comparable to those of state-of-the-art models. We conclude that the proposed model can outperform, or at least achieve results comparable with, existing anomaly detection models.

Keywords: anomaly detection; asynchronous advantage actor-critic; generative adversarial network; reinforcement learning.

Grants and funding

XH35 and SJ2019A05/China Academy of Engineering Physics