A Hybrid MPC for Constrained Deep Reinforcement Learning applied for Planar Robotic Arm

ISA Trans. 2021 Apr 1:S0019-0578(21)00195-6. doi: 10.1016/j.isatra.2021.03.046. Online ahead of print.

Abstract

Recently, deep reinforcement learning techniques have achieved tangible results in learning high-dimensional control tasks. Because of the trial-and-error interaction between the autonomous agent and the environment, the learning phase is unconstrained and therefore limited to the simulator. Such exploration has the additional drawback of consuming unnecessary samples early in the learning process. Model-based algorithms, on the other hand, handle this issue by learning the dynamics of the environment. However, model-free algorithms achieve higher asymptotic performance than model-based ones. The main contribution of this paper is a hybrid structured algorithm (MPC-DRL) that combines model predictive control (MPC) with deep reinforcement learning (DRL), exploiting the benefits of both methods to satisfy constraint conditions throughout the learning process. The validity of the proposed approach is demonstrated by learning a reachability task. The results show complete satisfaction of the constraint condition, represented by a static obstacle, with fewer samples and higher performance compared to state-of-the-art model-free algorithms.
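
The abstract does not give implementation details, but the general pattern it describes, an MPC layer that filters a DRL policy's proposed actions so that constraints hold even during exploration, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the 2-link kinematics, obstacle geometry, horizon, and all parameter values are assumptions introduced here for clarity.

```python
# Minimal sketch of a hybrid MPC-DRL step (illustrative assumptions only):
# a 2-link planar arm, a DRL policy proposing joint-velocity actions, and a
# short-horizon MPC projecting each proposal onto the obstacle-free set.
import numpy as np
from scipy.optimize import minimize

L1, L2 = 1.0, 1.0                  # link lengths (assumed)
OBSTACLE = np.array([1.2, 0.8])    # static obstacle position (assumed)
RADIUS = 0.3                       # required clearance (assumed)
DT, HORIZON = 0.05, 5              # MPC step size and lookahead (assumed)

def end_effector(q):
    """Forward kinematics of the 2-link planar arm."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def mpc_safety_filter(q, a_rl):
    """Return the action closest to the DRL proposal that keeps the
    end effector clear of the obstacle over a short rollout."""
    def cost(a):
        return np.sum((a - a_rl) ** 2)   # stay close to the learned action
    def clearance(a):
        # Roll the kinematic model forward under a constant action a.
        qs = q + DT * np.outer(np.arange(1, HORIZON + 1), a)
        dists = [np.linalg.norm(end_effector(qk) - OBSTACLE) for qk in qs]
        return min(dists) - RADIUS       # must remain >= 0
    res = minimize(cost, a_rl,
                   constraints={"type": "ineq", "fun": clearance})
    return res.x

# One interaction step: the policy proposes, MPC certifies, the arm moves.
rng = np.random.default_rng(0)
q = np.array([0.3, 0.5])             # current joint angles
a_rl = rng.normal(size=2)            # stand-in for a DRL policy output
a_safe = mpc_safety_filter(q, a_rl)  # constraint-satisfying action
q = q + DT * a_safe                  # simple kinematic state update
```

Projecting the proposed action onto the constraint-satisfying set, rather than discarding it, keeps exploration informative while enforcing the constraint at every learning step, which matches the paper's stated goal of constraint satisfaction throughout learning.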

Keywords: Artificial intelligence; Deep learning; Deep reinforcement learning; Hybrid controller; MPC; Machine learning; Planar robot; Reinforcement learning.