Extended residual learning with one-shot imitation learning for robotic assembly in semi-structured environment

Front Neurorobot. 2024 Apr 29;18:1355170. doi: 10.3389/fnbot.2024.1355170. eCollection 2024.

Abstract

Introduction: Robotic assembly tasks require precise manipulation and coordination, often necessitating advanced learning techniques to achieve efficient and effective performance. While residual reinforcement learning with a base policy has shown promise in this domain, existing approaches to constructing the base policy often rely either on hand-designed full-state features and policies or on extensive demonstrations, limiting their applicability in semi-structured environments.

Methods: In this study, we propose an innovative Object-Embodiment-Centric Imitation and Residual Reinforcement Learning (OEC-IRRL) approach that leverages an object-embodiment-centric (OEC) task representation to integrate vision models with imitation and residual learning. By utilizing a single demonstration and minimizing interactions with the environment, our method aims to enhance learning efficiency and effectiveness. The proposed method involves three key steps: constructing the object-embodiment-centric task representation, learning a base policy from the single demonstration via imitation learning with via-point movement primitives to generalize across different settings, and applying residual RL for uncertainty-aware policy refinement during the assembly phase.
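To make the composition concrete, the sketch below illustrates the general residual-reinforcement-learning pattern the abstract describes: a base action reproduced from a single demonstration (here a crude via-point interpolation standing in for a via-point movement primitive) plus a small learned correction added at execution time. All class and function names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of residual learning on top of a one-shot base policy.
# VMPBasePolicy, ResidualPolicy, and act() are hypothetical placeholders,
# not the OEC-IRRL authors' actual API.
import numpy as np


class VMPBasePolicy:
    """Base policy fit to one demonstration.

    A true via-point movement primitive would fit a smooth trajectory through
    the demonstrated via-points; here we approximate it with linear
    interpolation in the object-embodiment-centric frame.
    """

    def __init__(self, via_points: np.ndarray):
        self.via_points = via_points          # shape: (n_points, action_dim)

    def __call__(self, phase: float) -> np.ndarray:
        # phase in [0, 1] indexes progress along the demonstrated trajectory.
        idx = phase * (len(self.via_points) - 1)
        lo, hi = int(np.floor(idx)), int(np.ceil(idx))
        w = idx - lo
        return (1 - w) * self.via_points[lo] + w * self.via_points[hi]


class ResidualPolicy:
    """Learned residual: a small bounded correction on the base action."""

    def __init__(self, action_dim: int, scale: float = 0.05):
        self.action_dim = action_dim
        self.scale = scale                    # bounds the residual magnitude

    def __call__(self, observation: np.ndarray) -> np.ndarray:
        # Placeholder for an RL policy network; returns a bounded correction.
        return self.scale * np.tanh(np.zeros(self.action_dim))


def act(base: VMPBasePolicy, residual: ResidualPolicy,
        phase: float, observation: np.ndarray) -> np.ndarray:
    """Residual RL composition: executed action = base action + correction."""
    return base(phase) + residual(observation)
```

Bounding the residual keeps the learned correction close to the demonstrated motion, which is one common way residual RL limits the amount of environment interaction needed for refinement.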

Results: Through a series of comprehensive experiments, we investigate the impact of the OEC task representation on base and residual policy learning and demonstrate the effectiveness of the method in semi-structured environments. Our results indicate that the approach, requiring only a single demonstration and less than 1.2 h of interaction, improves success rates by 46% and reduces assembly time by 25%.

Discussion: This research presents a promising avenue for robotic assembly tasks, providing a viable solution without the need for specialized expertise or custom fixtures.

Keywords: imitation learning; object-embodiment-centric task representation; residual reinforcement learning; robotic assembly; semi-structured environment.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by the following programs: the National Key Research and Development Program of China (Grant No. 2021YFB3301400), the National Natural Science Foundation of China (Grant No. 52305105), and the Basic and Applied Basic Research Foundation of Guangdong Province (Grant Nos. 2022A1515240027 and 2023A1515010812).