Approximating Nash equilibrium for anti-UAV jamming Markov game using a novel event-triggered multi-agent reinforcement learning

Zikai Feng; Mengxing Huang; Yuanyuan Wu; Di Wu; Jinde Cao; Iakov Korovin; Sergey Gorbachev; Nadezhda Gorbacheva

doi:10.1016/j.neunet.2022.12.022

Neural Netw. 2023 Apr:161:330-342. doi: 10.1016/j.neunet.2022.12.022. Epub 2023 Feb 2.

Authors

Zikai Feng¹, Mengxing Huang², Yuanyuan Wu³, Di Wu⁴, Jinde Cao⁵, Iakov Korovin⁶, Sergey Gorbachev⁷, Nadezhda Gorbacheva⁸

Affiliations

¹ School of Information and Communication Engineering, Hainan University, Haikou, 570228, China; State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, 570228, China. Electronic address: 13523011824@163.com.
² School of Information and Communication Engineering, Hainan University, Haikou, 570228, China; State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, 570228, China. Electronic address: huangmx09@163.com.
³ School of Information and Communication Engineering, Hainan University, Haikou, 570228, China. Electronic address: wyuanyuan82@163.com.
⁴ Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Information and Communication Engineering, Hainan University, Haikou, 570228, China. Electronic address: hainuwudi@163.com.
⁵ School of Mathematics, Southeast University, Nanjing, 210096, China; Yonsei Frontier Lab, Yonsei University, Seoul 03722, South Korea. Electronic address: jdcao@seu.edu.cn.
⁶ Scientific Research Institute of Multiprocessor Computer Systems, Southern Federal University, 2, Chekhov st., Taganrog, 347928, Russia. Electronic address: korovin_yakov@mail.ru.
⁷ Russian Academy of Engineering, 9, building 4, Gazetny pereulok, Moscow, 125009, Russia. Electronic address: hanuman1000@mail.ru.
⁸ Scientific Research Institute of Multiprocessor Computer Systems, Southern Federal University, 2, Chekhov st., Taganrog, 347928, Russia. Electronic address: nadia7@sibmail.com.

PMID: 36774870

Abstract

In downlink communication, ground users currently struggle to cope with uncertain interference from intelligent aerial jammers. The cooperation and competition between ground users and unmanned aerial vehicle (UAV) jammers lead to an anti-UAV jamming Markov game. A model-free method based on multi-agent reinforcement learning (MARL) is therefore adopted to handle the Markov game. However, benchmark MARL strategies suffer from dimension explosion and convergence to local optima. To address these issues, this paper proposes a novel event-triggered multi-agent proximal policy optimization algorithm with a Beta strategy (ETMAPPO), which aims to reduce the dimension of the transmitted information and improve the efficiency of policy convergence. Under the event-triggering mechanism, agents learn to obtain appropriate observations at different moments, thereby reducing the transmission of valueless information. A Beta operator is used to optimize the action search by expanding the search scope of the policy space. Ablation simulations show that the proposed strategy achieves better global benefits with fewer dimensions of information than the benchmark algorithms. In addition, the convergence performance verifies that a well-trained ETMAPPO can achieve stable jamming strategies and stable anti-jamming strategies, which approximately constitute a Nash equilibrium of the anti-jamming Markov game.
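
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (`beta_action`, `EventTriggeredObservation`, `threshold`) is hypothetical. It makes two assumptions the abstract does not confirm: that the Beta strategy samples a bounded continuous action from a Beta distribution, and that the trigger is a fixed change threshold, whereas the paper describes agents that learn when to trigger.

```python
import random


def beta_action(alpha, beta, low, high, rng):
    """Sample a bounded action by rescaling a Beta(alpha, beta) draw.

    Assumption: in a trained policy, alpha and beta would be produced by a
    network; here they are plain positive numbers supplied by the caller.
    """
    unit_sample = rng.betavariate(alpha, beta)  # value in [0, 1]
    return low + (high - low) * unit_sample


class EventTriggeredObservation:
    """Forward a new observation only when it differs enough from the last one sent.

    Assumption: a fixed threshold on the largest per-component change stands in
    for the learned triggering rule described in the abstract.
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_sent = None
        self.transmissions = 0

    def step(self, observation):
        """Return the observation the agent acts on at this time step."""
        if self.last_sent is None or self._change(observation) > self.threshold:
            # Event fired: transmit and store the fresh observation.
            self.last_sent = list(observation)
            self.transmissions += 1
        # Otherwise the stored observation is reused and nothing is transmitted.
        return self.last_sent

    def _change(self, observation):
        return max(abs(new - old) for new, old in zip(observation, self.last_sent))


if __name__ == "__main__":
    rng = random.Random(0)
    trigger = EventTriggeredObservation(threshold=0.5)
    for obs in ([0.0, 0.0], [0.1, 0.2], [1.0, 0.0]):
        used = trigger.step(obs)
        power = beta_action(2.0, 2.0, low=0.0, high=1.0, rng=rng)
        print(used, round(power, 3))
    print("transmissions:", trigger.transmissions)
```

In this sketch the second observation changes by less than the threshold, so the stored one is reused and only two of the three observations are transmitted. A Beta draw always stays inside the action bounds, so no clipping is needed.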

Keywords: Anti-jamming Markov game; Beta strategy; Event-triggered multi-agent deep reinforcement learning; Nash equilibrium.

MeSH terms

Algorithms
Benchmarking
Learning*
Reinforcement, Psychology
Unmanned Aerial Devices*