Decomposing user-defined tasks in a reinforcement learning setup using TextWorld

Front Robot AI. 2023 Dec 22;10:1280578. doi: 10.3389/frobt.2023.1280578. eCollection 2023.

Abstract

This paper proposes a hierarchical reinforcement learning (HRL) method that decomposes a complex task into simpler sub-tasks and leverages them to improve the training of an autonomous agent in a simulated environment. For practical reasons (illustrative purposes, ease of implementation, a user-friendly interface, and useful functionality), we employ two Python frameworks, TextWorld and MiniGrid. MiniGrid serves as a 2D simulated representation of the real environment, while TextWorld serves as a high-level abstraction of this simulated environment. Training on this abstraction disentangles manipulation actions from navigation actions and allows us to design a dense reward function, rather than a sparse one, for the lower-level environment, which, as we show, improves training performance. Formal methods are used throughout the paper to establish that our algorithm is not prevented from deriving solutions.
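
The difference between a sparse and a dense reward under a sub-task decomposition can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the TextWorld or MiniGrid APIs: the `SubTask` and `DenseReward` classes, the dictionary-based state, and the key-and-door task are all hypothetical names chosen for illustration. The sketch assumes the high-level plan is an ordered list of sub-tasks, each with a predicate that checks whether it is satisfied in the current state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical state representation: a plain dictionary of facts.
State = Dict[str, object]


@dataclass
class SubTask:
    """One step of an (assumed) ordered high-level plan."""
    name: str
    is_done: Callable[[State], bool]


def sparse_reward(state: State, subtasks: List[SubTask]) -> float:
    """Reward 1.0 only once every sub-task is satisfied, 0.0 otherwise."""
    return 1.0 if all(t.is_done(state) for t in subtasks) else 0.0


@dataclass
class DenseReward:
    """Pays an equal share of the total reward each time the next
    sub-task in the plan is completed, so feedback arrives earlier."""
    subtasks: List[SubTask]
    next_index: int = 0

    def __call__(self, state: State) -> float:
        reward = 0.0
        # Credit every sub-task that has become satisfied, in plan order.
        while (self.next_index < len(self.subtasks)
               and self.subtasks[self.next_index].is_done(state)):
            reward += 1.0 / len(self.subtasks)
            self.next_index += 1
        return reward


# Hypothetical plan: pick up a key, walk to the door, open it.
plan = [
    SubTask("pick_up_key", lambda s: s["has_key"] is True),
    SubTask("reach_door", lambda s: s["agent_at"] == "door"),
    SubTask("open_door", lambda s: s["door_open"] is True),
]

# Hypothetical trajectory of states visited by the low-level agent.
trajectory = [
    {"agent_at": "key", "has_key": False, "door_open": False},
    {"agent_at": "key", "has_key": True, "door_open": False},
    {"agent_at": "door", "has_key": True, "door_open": False},
    {"agent_at": "door", "has_key": True, "door_open": True},
]

dense_fn = DenseReward(plan)
sparse_rewards = [sparse_reward(s, plan) for s in trajectory]
dense_rewards = [dense_fn(s) for s in trajectory]
```

In this toy trajectory the sparse signal is non-zero only at the final step, whereas the dense signal rewards each completed sub-task while summing to the same total. Splitting the reward equally across sub-tasks is one design choice among many and is assumed here only for simplicity.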

Keywords: autonomous agents; formal methods in robotics and automation; hierarchical reinforcement learning; reinforcement learning; task and motion planning.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was funded by the “Study, Design, Development, and Implementation of a Holistic System for Upgrading the Quality of Life and Activity of the Elderly” (MIS 5047294), which is implemented under the Action “Support for Regional Excellence,” funded by the Operational Program “Competitiveness, Entrepreneurship and Innovation” (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund).