Composing Synergistic Macro Actions for Reinforcement Learning Agents

IEEE Trans Neural Netw Learn Syst. 2024 May;35(5):7251-7258. doi: 10.1109/TNNLS.2022.3213606. Epub 2024 May 2.

Abstract

Macro actions have been shown to benefit an agent's learning process, which has motivated a variety of techniques for constructing more effective ones. However, previous techniques usually do not consider combining macro actions into a synergistic macro action ensemble, in which synergism arises when the constituent macro actions work well when used jointly by an agent during evaluation. Such an ensemble may allow an agent to perform better than it would with any of the individual macro actions within it. Motivated by recent advances in neural architecture search (NAS), in this brief we formulate the construction of a synergistic macro action ensemble as a Markov decision process (MDP) and evaluate each constructed ensemble as a whole. This formulation enables the proposed evaluation procedure to take synergism into account. Our experimental results demonstrate that the proposed framework is able to discover synergistic macro action ensembles. Furthermore, we highlight the benefits of these ensembles through a set of analytical cases.
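
The idea of building an ensemble step by step and scoring it only as a whole can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, since the abstract does not specify the state, action, or reward definitions: the state is taken to be the partial ensemble, each action appends one candidate macro or stops, and the terminal reward is the score of the complete ensemble. The names `construct_ensemble`, `toy_evaluate`, `random_search`, and `CANDIDATES` are hypothetical, the evaluator is a hand-written toy in place of training and evaluating an agent, and uniform random sampling stands in for a learned controller.

```python
import random
from typing import Callable, List, Optional, Tuple

Macro = Tuple[int, ...]       # a macro action: a fixed sequence of primitive action ids
Ensemble = Tuple[Macro, ...]  # assumed MDP state: the macros selected so far

STOP = None  # hypothetical terminating action of the construction process

# Hypothetical candidate pool of macro actions.
CANDIDATES: List[Macro] = [(0, 0), (1, 1), (0, 1, 0)]


def construct_ensemble(
    candidates: List[Macro],
    max_size: int,
    policy: Callable[[Ensemble, List[Optional[Macro]]], Optional[Macro]],
) -> Ensemble:
    """Roll out one construction episode and return the final ensemble."""
    state: Ensemble = ()
    while len(state) < max_size:
        remaining = [m for m in candidates if m not in state]
        choice = policy(state, remaining + [STOP])
        if choice is STOP:
            break
        state = state + (choice,)
    return state


def toy_evaluate(ensemble: Ensemble) -> float:
    """Toy stand-in for evaluating an agent that uses the whole ensemble.

    Each macro has an individual value, and one pair earns an extra bonus
    only when both members are present. That bonus is the 'synergy' that
    scoring macros one at a time would miss.
    """
    individual = {(0, 0): 1.0, (1, 1): 1.0, (0, 1, 0): 1.5}
    score = sum(individual[m] for m in ensemble)
    if (0, 0) in ensemble and (1, 1) in ensemble:
        score += 2.0
    return score


def random_search(candidates, max_size, evaluate, episodes, rng):
    """Sample ensembles with a uniform random policy and keep the best one."""
    policy = lambda state, options: rng.choice(options)
    best, best_score = (), evaluate(())
    for _ in range(episodes):
        ensemble = construct_ensemble(candidates, max_size, policy)
        score = evaluate(ensemble)  # reward is given only for the full ensemble
        if score > best_score:
            best, best_score = ensemble, score
    return best, best_score


if __name__ == "__main__":
    best, score = random_search(CANDIDATES, 2, toy_evaluate, 200, random.Random(0))
    print(best, score)
```

In this toy setting the macro `(0, 1, 0)` has the highest individual value, yet the pair `(0, 0)` and `(1, 1)` scores higher together, which is why the sketch scores the ensemble jointly instead of ranking macros one by one.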