Deep Reinforcement Learning for End-to-End Local Motion Planning of Autonomous Aerial Robots in Unknown Outdoor Environments: Real-Time Flight Experiments

Oualid Doukhi; Deok-Jin Lee

doi:10.3390/s21072534

Sensors (Basel). 2021 Apr 4;21(7):2534. doi: 10.3390/s21072534.

Authors

Oualid Doukhi¹, Deok-Jin Lee²

Affiliations

¹ Center for Artificial Intelligence & Autonomous Systems, Kunsan National University, 558 Daehak-ro, Naun 2(i)-dong, Gunsan 54150, Jeollabuk-do, Korea.
² School of Mechanical Design Engineering, Smart e-Mobility Lab, Center for Artificial Intelligence & Autonomous Systems, Jeonbuk National University, 567, Baekje-daero, Deokjin-gu, Jeonju-si 54896, Jeollabuk-do, Korea.

Abstract

Autonomous navigation and collision avoidance missions represent a significant challenge for robotic systems, as they generally operate in dynamic environments that require a high level of autonomy and flexible decision-making capabilities. This challenge is more acute for micro aerial vehicles (MAVs) because of their limited size and computational power. This paper presents a novel approach that enables a micro aerial vehicle equipped with a laser range finder to autonomously navigate among obstacles and reach a user-specified goal location in a GPS-denied environment, without the need for mapping or path planning. The proposed system uses an actor-critic-based reinforcement learning technique to train the aerial robot in the Gazebo simulator to perform a point-goal navigation task by directly mapping the MAV's noisy state and laser scan measurements to continuous motion control commands. The obtained policy, although trained entirely in a 3D simulator, can perform collision-free flight in the real world. Extensive simulations and real-time experiments were conducted and compared with a nonlinear model predictive control technique to show the policy's generalization to new, unseen environments and its robustness against localization noise. The obtained results demonstrate the system's effectiveness in flying safely and reaching the desired points by planning smooth forward linear velocities and heading rates.
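
The end-to-end mapping described in the abstract, from laser scan and goal information to a forward velocity and a heading rate, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `build_observation` and `TinyActor`, the sector count, the range and velocity limits, the network size, and the random, untrained weights. It shows only the shape of the observation-to-action interface, not the authors' trained policy or their training procedure.

```python
import numpy as np


def build_observation(scan, goal_dist, goal_bearing, max_range=10.0, n_sectors=10):
    """Compress a laser scan into sector minima and append the goal in polar form.

    Hypothetical helper: assumes len(scan) >= n_sectors and goal_bearing in [-pi, pi].
    """
    scan = np.asarray(scan, dtype=float)
    # Treat invalid returns (NaN or +inf) as "nothing within range".
    scan = np.nan_to_num(scan, nan=max_range, posinf=max_range, neginf=0.0)
    scan = np.clip(scan, 0.0, max_range)
    # Min-pool so that the closest obstacle in each angular sector is kept.
    sectors = np.array([s.min() for s in np.array_split(scan, n_sectors)])
    goal = np.array([min(goal_dist, max_range) / max_range, goal_bearing / np.pi])
    return np.concatenate([sectors / max_range, goal])


class TinyActor:
    """Untrained two-layer actor with random weights (illustrative only)."""

    def __init__(self, obs_dim, hidden=32, v_max=1.0, w_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, 2))
        self.b2 = np.zeros(2)
        self.v_max, self.w_max = v_max, w_max

    def act(self, obs):
        h = np.tanh(obs @ self.w1 + self.b1)
        out = np.tanh(h @ self.w2 + self.b2)
        v = 0.5 * (out[0] + 1.0) * self.v_max  # forward velocity in [0, v_max]
        w = out[1] * self.w_max                # heading rate in [-w_max, w_max]
        return float(v), float(w)


# Usage: a 360-beam scan with an obstacle 2 m away in the first sector.
scan = np.full(360, np.inf)
scan[:36] = 2.0
obs = build_observation(scan, goal_dist=5.0, goal_bearing=np.pi / 4)
v, w = TinyActor(obs.size).act(obs)
```

Bounding the outputs with `tanh` and rescaling them is one common way to keep continuous commands inside actuator limits. In an actor-critic setting, a separate critic network would score these actions during training; that part is omitted here.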

Keywords: autonomous navigation; collision-free; deep reinforcement learning; unmanned aerial vehicle.

Grants and funding

2019R1F1A1049711, NRF-2020M3C1C1A02084772/This research was funded and conducted under the Competency Development Program for Industry Specialists of the Korean Ministry of Trade, Industry, and Energy (MOTIE), operated by the Korea Institute for Advancement of Technology (KIAT). (No. N0002428, HR