Velocity range-based reward shaping technique for effective map-less navigation with LiDAR sensor and deep reinforcement learning

HyeokSoo Lee; Jongpil Jeong

doi:10.3389/fnbot.2023.1210442

Front Neurorobot. 2023 Sep 6:17:1210442. doi: 10.3389/fnbot.2023.1210442. eCollection 2023.

Authors

HyeokSoo Lee¹,², Jongpil Jeong¹

Affiliations

¹ Department of Smart Factory Convergence, AI Factory Lab, Sungkyunkwan University, Suwon, Republic of Korea.
² Research & Development Team, THiRA-UTECH Co., Ltd., Seoul, Republic of Korea.

Abstract

In recent years, hardware sensors that resemble human sensory functions have developed rapidly and can now acquire information at a level beyond that of humans, while in software, artificial intelligence has enabled cognitive abilities and decision-making such as prediction, analysis, and judgment. These advances are being adopted across many industries and fields. In particular, new hardware and software technologies are being applied rapidly to robotics products, which now show a level of performance and completeness that was previously unimaginable. In this paper, we study optimal path planning for the autonomous driving of mobile robots, which are widely used in logistics and manufacturing sites, using LiDAR sensors and deep reinforcement learning in workplaces without a map or grid coordinates. To this end, we reviewed the hardware configuration of mobile robots capable of autonomous driving, examined the characteristics of their core sensors, and investigated the core technologies of autonomous driving. In addition, we reviewed deep reinforcement learning algorithms suitable for the autonomous driving of mobile robots, defined a deep neural network for autonomous driving data conversion, and defined a reward function for path planning. We built these components into a simulation environment and verified autonomous path planning through experiments. We then proposed an additional reward technique, the "Velocity Range-based Evaluation Method", to further improve the performance indicators required in real-world settings, and verified its effectiveness. The simulation environment and detailed experimental results are described in this paper, which is expected to serve as guidance and a reference for applying these technologies in the field.

Keywords: LiDAR; SLAM; autonomous mobile robot; continuous action; deep reinforcement learning; map-less navigation; reward shaping.

Grants and funding

This research was supported by Sungkyunkwan University and BK21 FOUR (Graduate School Innovation), and funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF). It was also supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01417), supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).