UAVs Maneuver Decision-Making Method Based on Transfer Reinforcement Learning

Comput Intell Neurosci. 2022 Nov 14;2022:2399796. doi: 10.1155/2022/2399796. eCollection 2022.

Abstract

This paper addresses the 1-vs-1 UAV confrontation problem in a complex environment with randomly distributed obstacles, using the DDPG (deep deterministic policy gradient) algorithm to design a maneuver decision-making method for the UAVs. Traditional methods generally assume that all obstacles are known globally. Here, an airborne lidar detection model is designed for the UAV, which can effectively handle obstacle avoidance in the presence of a large number of unknown obstacles. Building on this model, transfer learning is used to carry the strategy trained by one UAV in a simple task over to a new, similar task, where it is then used to train the strategy of the other UAV. In this way the intelligence of the UAVs on both sides is improved alternately and progressively. The simulation results show that the transfer learning method speeds up the training process and improves the training effect.
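To make the idea of an onboard lidar detection model more concrete, the sketch below shows one way such a sensor could be simulated. It is not the model from the paper: the abstract gives no geometry, so the 2D plane, the circular obstacles, the beam count, the field of view, and the name `lidar_scan` are all assumptions made for illustration only.

```python
import math


def lidar_scan(pos, heading, obstacles, n_beams=8, fov=math.pi, max_range=10.0):
    """Hypothetical 2D lidar: return one distance per beam.

    pos: (x, y) of the UAV; heading: radians;
    obstacles: list of (cx, cy, radius) circles (an assumed obstacle shape).
    A reading equal to max_range means the beam hit nothing in range.
    """
    readings = []
    for i in range(n_beams):
        # Spread beams evenly across the field of view, centered on heading.
        frac = i / (n_beams - 1) if n_beams > 1 else 0.5
        angle = heading - fov / 2 + frac * fov
        dx, dy = math.cos(angle), math.sin(angle)
        nearest = max_range
        for cx, cy, r in obstacles:
            # Solve |pos + t*d - center|^2 = r^2 for the nearest entry point t.
            ox, oy = pos[0] - cx, pos[1] - cy
            b = ox * dx + oy * dy
            c = ox * ox + oy * oy - r * r
            disc = b * b - c
            if disc < 0:
                continue  # the beam's line never touches this circle
            t = -b - math.sqrt(disc)
            # Keep only hits in front of the UAV and closer than any so far.
            if 0 <= t < nearest:
                nearest = t
        readings.append(nearest)
    return readings


# Example: one obstacle straight ahead, three beams over a 180-degree arc.
scan = lidar_scan((0.0, 0.0), 0.0, [(5.0, 0.0, 1.0)], n_beams=3)
```

In a setup like the one described, a vector of this kind could be appended to the UAV's state before it is passed to the DDPG actor, so that only locally sensed obstacles, rather than a global obstacle map, inform the maneuver decision. How the paper actually encodes its lidar readings is not stated in the abstract.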

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Reinforcement, Psychology*