Learning hybrid locomotion skills-Learn to exploit residual actions and modulate model-based gait control

Mohammadreza Kasaei; Miguel Abreu; Nuno Lau; Artur Pereira; Luis Paulo Reis; Zhibin Li

doi:10.3389/frobt.2023.1004490

Front Robot AI. 2023 Apr 10;10:1004490. doi: 10.3389/frobt.2023.1004490. eCollection 2023.

Authors

Mohammadreza Kasaei¹, Miguel Abreu², Nuno Lau³, Artur Pereira³, Luis Paulo Reis², Zhibin Li⁴

Affiliations

¹ School of Informatics, University of Edinburgh, Edinburgh, United Kingdom.
² University of Porto, LIACC / LASI / FEUP, Artificial Intelligence and Computer Science Lab, Faculty of Engineering of the University of Porto, Porto, Portugal.
³ IEETA / LASI / DETI University of Aveiro, Aveiro, Portugal.
⁴ Department of Computer Science, University College London, London, United Kingdom.

Abstract

This work develops a hybrid framework that combines machine learning and control approaches for legged robots to achieve new capabilities of balancing against external perturbations. The framework embeds a kernel, a model-based, fully parametric, closed-loop analytical controller, as the gait pattern generator. On top of that, a neural network trained with symmetric partial data augmentation learns to automatically adjust the parameters of the gait kernel and to generate compensatory actions for all joints, thus significantly improving stability under unexpected perturbations. Seven neural network policies with different configurations were optimized to validate the effectiveness of kernel parameter modulation and of residual-action compensation for the arms and legs, both separately and in combination. The results showed that modulating the kernel parameters alongside the residual actions improved stability significantly. The performance of the proposed framework was also evaluated across a set of challenging simulated scenarios, where it demonstrated considerable improvements over the baseline in recovering from large external forces (up to 118%). The robustness of the framework to measurement noise and model inaccuracies was assessed through simulations, which showed that it remains robust in the presence of these uncertainties. Finally, the trained policies were validated across a set of unseen scenarios and generalized to dynamic walking.
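The combination described in the abstract, a gait kernel whose parameters are modulated by a policy while the same policy adds residual joint actions, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `gait_kernel` is a placeholder sinusoid rather than the paper's analytical controller, the two-element parameter vector is invented, and `constant_policy` stands in for a trained neural network.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A policy maps an observation to (parameter offsets, per-joint residuals).
Policy = Callable[[Sequence[float]], Tuple[List[float], List[float]]]


def gait_kernel(phase: float, params: Sequence[float], n_joints: int) -> List[float]:
    """Hypothetical stand-in for the model-based gait generator.

    params = (amplitude, frequency scale); returns one target per joint.
    The real kernel in the paper is a closed-loop analytical controller.
    """
    amplitude, freq = params
    return [
        amplitude * math.sin(freq * phase + j * math.pi / n_joints)
        for j in range(n_joints)
    ]


def hybrid_step(
    phase: float,
    base_params: Sequence[float],
    policy: Policy,
    observation: Sequence[float],
    n_joints: int,
) -> List[float]:
    """One control step: modulate the kernel parameters, then add residuals."""
    delta_params, residuals = policy(observation)
    # Modulation: the policy shifts the nominal kernel parameters.
    params = [b + d for b, d in zip(base_params, delta_params)]
    targets = gait_kernel(phase, params, n_joints)
    # Residual actions: the policy adds a compensatory offset to each joint.
    return [t + r for t, r in zip(targets, residuals)]


def constant_policy(delta_params: List[float], residuals: List[float]) -> Policy:
    """Hypothetical policy that ignores the observation (for demonstration only)."""
    def policy(_observation: Sequence[float]) -> Tuple[List[float], List[float]]:
        return list(delta_params), list(residuals)
    return policy


if __name__ == "__main__":
    policy = constant_policy([0.1, 0.0], [0.05] * 4)
    print(hybrid_step(0.5, [0.3, 1.0], policy, observation=[0.0], n_joints=4))
```

With a policy that outputs all zeros, `hybrid_step` reduces to the plain kernel output, which is one way to read the baseline against which the learned policies are compared.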

Keywords: deep reinforcement learning (DRL); humanoid robot; learning motor skills; learning residual actions; modulate gait generator.

Grants and funding

This work has been supported by Portuguese National Funds through the FCT - Foundation for Science and Technology, in the context of the project UIDB/00127/2020. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.