Similar to the role of Markov decision processes in reinforcement learning, Markov games (also called stochastic games) lay the foundation for the study of multi-agent reinforcement learning and sequential agent interactions. We introduce approximate Markov perfect equilibrium as a solution concept for the computational problem of finite-state stochastic games played over an infinite horizon, and we prove that computing it is PPAD-complete. This solution concept preserves the Markov perfect property and opens up the possibility of extending the success of multi-agent reinforcement learning algorithms on static two-player games to multi-agent dynamic games, expanding the realm of the PPAD-complete class.
Keywords: Markov game; Markov perfect equilibrium; PPAD-completeness; multi-agent reinforcement learning; stochastic game.
© The Author(s) 2022. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd.