A Hierarchical Framework for Quadruped Robots Gait Planning Based on DDPG

Yanbiao Li; Zhao Chen; Chentao Wu; Haoyu Mao; Peng Sun

doi:10.3390/biomimetics8050382

Biomimetics (Basel). 2023 Aug 22;8(5):382. doi: 10.3390/biomimetics8050382.

Authors

Yanbiao Li ¹ ²; Zhao Chen ¹ ² ³; Chentao Wu ¹ ²; Haoyu Mao ¹ ² ³; Peng Sun ¹ ² ³

Affiliations

¹ College of Mechanical Engineering, Zhejiang University of Technology, Hangzhou 310023, China.
² Key Laboratory of Special Purpose Equipment and Advanced Processing Technology, Ministry of Education and Zhejiang Province, Zhejiang University of Technology, Hangzhou 310023, China.
³ Huzhou Institute of Digital Economy and Technology, Zhejiang University of Technology, Huzhou 313000, China.

Abstract

In recent years, significant progress has been made in applying reinforcement learning to the control of legged robots. Quadruped robots remain difficult, however: their continuous state space and large action space make optimal control hard to achieve with a simple reinforcement learning controller. This paper introduces a hierarchical reinforcement learning framework based on the Deep Deterministic Policy Gradient (DDPG) algorithm to achieve optimal motion control for quadruped robots. The framework consists of a high-level planner, a low-level controller based on model predictive control (MPC), and a trajectory generator. The agents in the high-level planner are trained to supply ideal motion parameters to the low-level controller. The low-level controller uses MPC and PD controllers to generate the foot-end forces and calculates the joint motor torques through inverse kinematics. Simulation results show that the trained hierarchical framework achieves better motion performance than DDPG used alone.

Keywords: Deep Deterministic Policy Gradient; hierarchical reinforcement learning; model predictive control; quadruped robots.
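
The data flow described in the abstract (planner output, then foot-end force, then joint torque) can be illustrated with a minimal single-leg sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the planar two-link leg, the single tanh layer standing in for a trained DDPG actor, and the parameter ranges are all hypothetical. The MPC stage is omitted entirely. The force-to-torque step uses the standard static Jacobian-transpose relation, which may differ from the inverse-kinematics computation the paper refers to.

```python
import math


def planner_action(state, weights, low, high):
    """Hypothetical stand-in for a trained DDPG actor (deterministic policy).

    One tanh unit per motion parameter, rescaled from [-1, 1] to the
    assumed range [low[i], high[i]].
    """
    action = []
    for w_row, lo, hi in zip(weights, low, high):
        z = sum(w * s for w, s in zip(w_row, state))
        action.append(lo + 0.5 * (math.tanh(z) + 1.0) * (hi - lo))
    return action


def pd_foot_force(p_des, p, v_des, v, kp, kd):
    """PD law on foot position and velocity, applied per axis."""
    return [kp * (pd - pi) + kd * (vd - vi)
            for pd, pi, vd, vi in zip(p_des, p, v_des, v)]


def leg_jacobian(q1, q2, l1, l2):
    """Jacobian of an assumed planar two-link leg with foot position
    x = l1*cos(q1) + l2*cos(q1+q2), y = l1*sin(q1) + l2*sin(q1+q2)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def joint_torques(jac, force):
    """Static mapping tau = J^T f from foot force to joint torque
    (an assumed substitute for the paper's torque computation)."""
    return [jac[0][0] * force[0] + jac[1][0] * force[1],
            jac[0][1] * force[0] + jac[1][1] * force[1]]


# One control tick with illustrative numbers only.
params = planner_action([0.0, 0.0], [[0.0, 0.0]], [0.02], [0.10])
force = pd_foot_force([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0],
                      kp=100.0, kd=10.0)
torque = joint_torques(leg_jacobian(0.0, math.pi / 2, 1.0, 1.0), [0.0, 1.0])
```

In a full hierarchical controller, the planner output would parameterize the trajectory generator and the MPC problem instead of being read directly, and the loop would run for all four legs at the control rate.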

Grants and funding

U21A20122, 52105037, 51975523 / National Natural Science Foundation of China