Reinforcement learning-based formation-surrounding control for multiple quadrotor UAVs pursuit-evasion games

Hang Xiong; Ying Zhang

doi:10.1016/j.isatra.2023.12.006

ISA Trans. 2024 Feb:145:205-224. doi: 10.1016/j.isatra.2023.12.006. Epub 2023 Dec 8.

Authors

Hang Xiong¹, Ying Zhang²

Affiliations

¹ Logistics Engineering College, Shanghai Maritime University, Shanghai 201306, China.
² College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China. Electronic address: yingzhang@shmtu.edu.cn.

PMID: 38105171

Abstract

This paper proposes a reinforcement learning-based formation-surrounding control method for multiple quadrotor unmanned aerial vehicle (UAV) pursuit-evasion (MPE) game systems subject to external disturbances. In the MPE game framework, the pursuers aim to surround the evaders equally while forming the desired formation, whereas the evaders try to avoid being surrounded. By constructing position and attitude tracking error subsystems of the quadrotor UAV, this paper proposes two control strategies that combine the feedforward control technique with the reinforcement learning (RL) method. First, two novel cost functions are presented for the quadrotor UAV with external disturbances. Then, two RL-based control schemes are developed to guarantee the stability of the tracking error subsystems. Subsequently, two critic-only neural network (NN) weight update laws that need only satisfy finite excitation conditions are proposed to estimate the optimal cost function. Furthermore, a Nash equilibrium for the multiple quadrotor UAVs is achieved by using the RL strategy to solve the Hamilton-Jacobi-Isaacs (HJI) equations. The equally surrounding property is proved for the first time in this paper by using Euler's formula. Finally, numerical simulation results are given to show the effectiveness and superior performance of the proposed control method.
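The abstract does not define "equally surrounding" formally. As a minimal sketch, assume it means that N pursuers sit at equal angular spacing on a circle centred on the evader in the plane. Under that assumption, Euler's formula gives each offset as r·e^{iθ_k} with θ_k = θ_0 + 2πk/N, and these offsets sum to zero for N ≥ 2, so the pursuers' centroid coincides with the evader. The function names, the planar simplification, and the parameters below are illustrative and are not taken from the paper.

```python
import cmath
import math


def surround_positions(evader, n_pursuers, radius, phase=0.0):
    """Place n_pursuers at equal angular spacing on a circle around the evader.

    Assumption: "equally surrounding" is modelled as equal angular spacing
    in the plane; this is an illustration, not the paper's control law.
    """
    if n_pursuers < 2:
        raise ValueError("need at least two pursuers to surround an evader")
    center = complex(evader[0], evader[1])
    points = []
    for k in range(n_pursuers):
        angle = phase + 2.0 * math.pi * k / n_pursuers
        # Euler's formula: exp(i*angle) = cos(angle) + i*sin(angle)
        p = center + radius * cmath.exp(1j * angle)
        points.append((p.real, p.imag))
    return points


def centroid(points):
    """Arithmetic mean of a list of planar points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```

For example, `centroid(surround_positions((1.0, -2.0), 4, 3.0))` returns the evader position up to floating-point rounding, which is the zero-sum property of the N-th roots of unity.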

Keywords: Equally surrounding control; Formation control; Multiple quadrotor UAVs; Pursuit-Evasion games; Reinforcement learning.