Intelligent Smart Marine Autonomous Surface Ship Decision System Based on Improved PPO Algorithm

Wei Guan; Zhewen Cui; Xianku Zhang

doi:10.3390/s22155732

Sensors (Basel). 2022 Jul 31;22(15):5732. doi: 10.3390/s22155732.

Authors

Wei Guan¹, Zhewen Cui¹, Xianku Zhang¹

Affiliation

¹ Navigation College, Dalian Maritime University, Dalian 116026, China.

Abstract

With the development of artificial intelligence technology, behavior decision-making for an intelligent smart marine autonomous surface ship (SMASS) has become particularly important. This research proposes a local path planning and behavior decision-making approach based on an improved Proximal Policy Optimization (PPO) algorithm, which can drive an unmanned SMASS to its target without requiring any human experience. In addition, generalized advantage estimation was added to the loss function of the PPO algorithm, which allows the baselines in PPO to be self-adjusted. First, the SMASS was modeled with the Nomoto model in a simulated waterway. Then, distances, obstacles, and prohibited areas were regularized as rewards or punishments, which were used to judge the performance and manipulation decisions of the vessel. Subsequently, the improved PPO was introduced to learn the action-reward model, and the trained neural network model was used to manipulate the SMASS's movement. To achieve higher reward values, the SMASS could find an appropriate path or navigation strategy by itself. After a sufficient number of training rounds, a convincing path and manipulation strategy would likely be produced. Compared with existing methods, the proposed approach is more effective in self-learning and continuous optimization and is thus closer to human manipulation.
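
The abstract does not spell out how generalized advantage estimation (GAE) enters the loss, so the following is only a minimal sketch of the standard GAE recursion, not the authors' implementation. The function name `gae_advantages` and the default values of the discount factor `gamma` and the smoothing parameter `lam` are assumptions made for illustration. The temporal-difference residual is delta_t = r_t + gamma * V(s_{t+1}) - V(s_t), and the advantage is the (gamma * lam)-discounted sum of future residuals.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
    lam: float = 0.95,    # assumed GAE smoothing parameter, not taken from the paper
) -> List[float]:
    """Compute GAE advantages for one trajectory.

    `values` must hold one more entry than `rewards`: the last entry is the
    bootstrap value of the state reached after the final step (0.0 if terminal).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have len(rewards) + 1 entries")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later residuals.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


if __name__ == "__main__":
    # Hypothetical three-step episode with a zero value baseline.
    print(gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], gamma=1.0, lam=1.0))
```

With `lam = 0` the estimate reduces to the one-step TD residual (low variance, more bias); with `lam = 1` it becomes the full discounted return minus the value baseline (less bias, higher variance). Values in between trade the two off.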

Keywords: Nomoto; PPO; SMASS; decision-making; deep reinforcement learning.

MeSH terms

Algorithms
Artificial Intelligence*
Humans
Neural Networks, Computer
Policy
Ships*
