Reinforcement learning based variable damping control of wearable robotic limbs for maintaining astronaut pose during extravehicular activity

Sikai Zhao; Tianjiao Zheng; Dongbao Sui; Jie Zhao; Yanhe Zhu

doi:10.3389/fnbot.2023.1093718

Front Neurorobot. 2023 Feb 15:17:1093718. doi: 10.3389/fnbot.2023.1093718. eCollection 2023.

Authors

Sikai Zhao¹, Tianjiao Zheng¹, Dongbao Sui¹, Jie Zhao¹, Yanhe Zhu¹

Affiliation

¹ State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China.

Abstract

When astronauts perform on-orbit servicing during extravehicular activity (EVA) without the help of the space station's robotic arms, maintaining an appropriate position after an impact is difficult and labor-intensive. To solve this problem, we propose a wearable robotic limb system for astronaut assistance and a variable damping control method for maintaining the astronaut's position. The requirements for the astronaut's impact resistance during EVA were analyzed and comprise four capabilities: deviation resistance, fast return, oscillation resistance, and accurate return. To meet these requirements, the system consisting of the astronaut and the robotic limbs was modeled and simplified. Combining this simplified model with a reinforcement learning algorithm yielded a variable damping controller for the end of the robotic limb, which regulates the dynamic performance of the robot end to resist oscillation after impact. A weightless simulation environment for the astronaut with robotic limbs was constructed. The simulation results demonstrate that the proposed method meets the recommended requirements for maintaining an astronaut's position during EVA. The fixed damping control method failed to meet all four requirements at the same time, regardless of how the damping coefficient was set. By contrast, the proposed variable damping controller satisfied all of the impact-resisting requirements on its own: it prevented excessive deviation from the original position and achieved a fast return to the starting point, reducing the maximum deviation displacement by 39.3% and the recovery time by 17.7%. It also prevented reciprocating oscillation and returned to the original position accurately.
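The contrast between fixed and variable damping can be illustrated with a toy one-dimensional model. The sketch below is not the controller from the paper: it replaces the learned reinforcement learning policy with a hand-written switching rule, and it models the robot end as a single mass on a spring and damper. All names and parameter values (`simulate`, `fixed`, `switching`, the mass, stiffness, damping levels, and impact velocity) are assumptions chosen for illustration only.

```python
# Toy 1-D mass-spring-damper hit by an impact (modelled as an initial velocity).
# Hypothetical parameters; this is NOT the paper's model or its RL controller.

def simulate(damping_rule, m=1.0, k=4.0, v0=1.0, dt=1e-3, t_end=20.0):
    """Integrate m*x'' = -k*x - c(x, v)*x' with semi-implicit Euler.

    Returns (maximum absolute deviation, final position).
    """
    x, v = 0.0, v0          # start at the reference position, moving after impact
    max_dev = 0.0
    for _ in range(int(t_end / dt)):
        c = damping_rule(x, v)          # damping may change at every step
        a = (-k * x - c * v) / m
        v += a * dt                     # update velocity first (semi-implicit)
        x += v * dt
        max_dev = max(max_dev, abs(x))
    return max_dev, x


def fixed(c):
    """Fixed damping: the same coefficient at every step."""
    return lambda x, v: c


def switching(c_low, c_high):
    """Assumed heuristic: damp strongly while moving away from the
    reference position (x and v share a sign), weakly while returning."""
    return lambda x, v: c_high if x * v >= 0 else c_low


max_fixed, end_fixed = simulate(fixed(1.0))
max_var, end_var = simulate(switching(1.0, 6.0))

print(f"fixed damping:    max deviation {max_fixed:.4f}, final position {end_fixed:.2e}")
print(f"variable damping: max deviation {max_var:.4f}, final position {end_var:.2e}")
```

In this toy setting the switching rule limits the first excursion by using high damping on the way out and keeps the return quick by using low damping on the way back. A learned policy such as the one described in the abstract would select the damping from the system state rather than from a fixed rule like this one.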

Keywords: extravehicular activity; modular robot; reinforcement learning; variable damping control; wearable robotic limbs.

Grants and funding

This research was funded by the National Natural Science Foundation of China (NSFC) (Nos. 52025054 and 52105016).