Path Planning for Unmanned Surface Vehicles with Strong Generalization Ability Based on Improved Proximal Policy Optimization

Sensors (Basel). 2023 Oct 31;23(21):8864. doi: 10.3390/s23218864.

Abstract

To solve the problems of path planning and dynamic obstacle avoidance for an unmanned surface vehicle (USV) in a locally observable, dynamic ocean environment, a visual perception and decision-making method based on deep reinforcement learning is proposed. This method replaces the fully connected layers in the Proximal Policy Optimization (PPO) neural network structure with a convolutional neural network (CNN). In this way, the degree to which sample information is memorized or forgotten is controlled. Moreover, the method accumulates the reward model faster by preferentially learning samples with high reward values. Taking the USV-centered radar perception of the local environment as input, actions are produced through an end-to-end learning model, so that environment perception and decision-making form a closed loop. The proposed algorithm therefore adapts well to different marine environments. The simulation results show that, compared with the PPO, Soft Actor-Critic (SAC), and Deep Q Network (DQN) algorithms, the proposed algorithm converges faster and achieves better path planning performance in partially or fully unknown ocean fields.

Keywords: USV; deep neural network; deep reinforcement learning; generalization; path planning; perception.
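
To make the architecture described in the abstract concrete, the sketch below shows one way a CNN backbone could replace the fully connected layers of a PPO actor-critic network that consumes a USV-centered radar map. This is not the authors' implementation: the 64x64 single-channel radar grid, the layer sizes, the discrete action count, and the PyTorch framework are all assumptions made for illustration.

# Hypothetical sketch (not the paper's code): CNN-based actor-critic
# backbone for PPO operating on a USV-centred local radar occupancy grid.
import torch
import torch.nn as nn

class RadarActorCritic(nn.Module):
    def __init__(self, n_actions: int = 8, grid_size: int = 64):
        super().__init__()
        # Convolutional encoder replaces the fully connected layers that a
        # vanilla PPO network would apply to a flattened observation vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        # Infer the flattened feature size from a dummy forward pass.
        with torch.no_grad():
            feat_dim = self.encoder(torch.zeros(1, 1, grid_size, grid_size)).shape[1]
        # Separate heads: discrete action logits (actor) and state value (critic).
        self.policy_head = nn.Linear(feat_dim, n_actions)
        self.value_head = nn.Linear(feat_dim, 1)

    def forward(self, radar_grid: torch.Tensor):
        feats = self.encoder(radar_grid)
        return self.policy_head(feats), self.value_head(feats)

if __name__ == "__main__":
    net = RadarActorCritic()
    obs = torch.rand(4, 1, 64, 64)            # batch of local radar maps
    logits, value = net(obs)
    action = torch.distributions.Categorical(logits=logits).sample()
    print(action.shape, value.shape)          # torch.Size([4]) torch.Size([4, 1])

Per the abstract, PPO update batches would additionally favor transitions with high reward values; a sampling weight proportional to reward magnitude would be one plausible way to realize that prioritization, though the exact rule is not specified here.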