Bayesian optimization with unknown constraints in graphical skill models for compliant manipulation tasks using an industrial robot

Volker Gabler; Dirk Wollherr

doi:10.3389/frobt.2022.993359

Front Robot AI. 2022 Oct 14;9:993359. doi: 10.3389/frobt.2022.993359. eCollection 2022.

Authors

Volker Gabler¹, Dirk Wollherr¹

Affiliation

¹ All authors are with the Chair of Automatic Control Engineering, TUM School of Computation, Information and Technology, Technical University of Munich, Munich, Germany.

Abstract

This article focuses on learning manipulation skills through episodic reinforcement learning (RL) in unknown environments using industrial robot platforms. These platforms usually do not provide the compliant control modalities required to cope with unknown environments, e.g., force-sensitive contact tooling. This requires designing a suitable controller while also providing the ability to adapt the controller parameters online from collected evidence. To this end, this work extends existing work on meta-learning for graphical skill-formalisms. First, we outline how a hybrid force-velocity controller can be applied to an industrial robot in order to design a graphical skill-formalism. This skill-formalism incorporates available task knowledge and allows for online episodic RL. In contrast to existing work, we further propose to extend this skill-formalism by estimating the success probability of the task to be learned by means of factor graphs. This method assigns samples to individual factors, i.e., Gaussian processes (GPs), more efficiently and thus improves the learning performance, especially at early stages, where successful samples are usually sparse. Finally, we propose suitable constraint GP models and acquisition functions to obtain new samples that optimize the information gain while also accounting for the success probability of the task. We outline a specific application example, the task of inserting the tip of a screwdriver into a screwhead with an industrial robot, and evaluate our proposed extension against state-of-the-art methods. The collected data indicate that our method allows artificial agents to obtain feasible samples faster than existing approaches while achieving a smaller regret value. This highlights the potential of our proposed work for future robotic applications.
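
To illustrate the general idea of an acquisition function that accounts for success probability, the following minimal sketch weights the standard expected improvement (EI) by an estimated probability of task success, a common construction in Bayesian optimization with unknown constraints. It is not the authors' acquisition function or GP model; the function names, the minimization convention, and all numerical values are assumptions made purely for illustration.

```python
import math
import numpy as np


def normal_pdf(z):
    """Standard normal density, evaluated element-wise."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF via the error function, evaluated element-wise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Expected improvement for a minimization problem.

    mean, std: posterior mean and standard deviation at candidate points.
    best: lowest objective value observed so far among successful samples.
    """
    std = np.maximum(std, 1e-12)  # guard against division by zero
    z = (best - mean) / std
    return (best - mean) * normal_cdf(z) + std * normal_pdf(z)


def constrained_ei(mean, std, best, p_success):
    """EI weighted by the estimated probability that the episode succeeds."""
    return expected_improvement(mean, std, best) * p_success


# Hypothetical posterior values at three candidate parameter sets.
mean = np.array([0.2, 0.5, 0.1])        # predicted cost
std = np.array([0.1, 0.3, 0.2])         # predictive uncertainty
p_success = np.array([0.9, 0.8, 0.05])  # estimated success probability
best = 0.3

scores = constrained_ei(mean, std, best, p_success)
next_index = int(np.argmax(scores))
unconstrained_index = int(np.argmax(expected_improvement(mean, std, best)))
```

In this toy setting, plain EI prefers the third candidate because of its low predicted cost, whereas the success-weighted score prefers the first, since the third is very unlikely to succeed. This mirrors, in simplified form, the trade-off between information gain and task success described above.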

Keywords: Bayesian optimization; compliant manipulation; episodic reinforcement learning; robot learning and control; safe learning.