Reward Maximization Through Discrete Active Inference

Lancelot Da Costa; Noor Sajid; Thomas Parr; Karl Friston; Ryan Smith

doi:10.1162/neco_a_01574

Neural Comput. 2023 Apr 18;35(5):807-852. doi: 10.1162/neco_a_01574.

Authors

Lancelot Da Costa¹, Noor Sajid², Thomas Parr³, Karl Friston⁴, Ryan Smith⁵

Affiliations

¹ Department of Mathematics, Imperial College London, London SW7 2AZ, U.K. l.da-costa@imperial.ac.uk.
² Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3AR, U.K. noor.sajid.18@ucl.ac.uk.
³ Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3AR, U.K. thomas.parr.12@ucl.ac.uk.
⁴ Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3AR, U.K. k.friston@ucl.ac.uk.
⁵ Laureate Institute for Brain Research, Tulsa, OK 74136, U.S.A. rsmith@laureateinstitute.org.

PMID: 36944240
DOI: 10.1162/neco_a_01574

Abstract

Active inference is a probabilistic framework for modeling the behavior of biological and artificial agents, which derives from the principle of minimizing free energy. In recent years, this framework has been applied successfully to a variety of situations where the goal was to maximize reward, often offering comparable and sometimes superior performance to alternative approaches. In this article, we clarify the connection between reward maximization and active inference by demonstrating how and when active inference agents execute actions that are optimal for maximizing reward. Specifically, we show the conditions under which active inference produces the optimal solution to the Bellman equation, a formulation that underlies several approaches to model-based reinforcement learning and control. On partially observed Markov decision processes, the standard active inference scheme can produce Bellman optimal actions for planning horizons of 1 but not beyond. In contrast, a recently developed recursive active inference scheme (sophisticated inference) can produce Bellman optimal actions on any finite temporal horizon. We complement the analysis with a discussion of the broader relationship between active inference and reinforcement learning.
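
To make the benchmark in the abstract concrete, the following is a minimal sketch of what "Bellman optimal on a finite temporal horizon" means, computed by backward induction. It is not the paper's active inference or sophisticated inference scheme, and it assumes a fully observed Markov decision process rather than a partially observed one. The function name `backward_induction`, the three-state chain, the two actions, and the reward vector are all hypothetical choices made for illustration.

```python
def backward_induction(P, R, horizon):
    """Solve the finite-horizon Bellman equation by backward induction.

    P[a][s][s2] : probability of moving from state s to s2 under action a
    R[s2]       : reward received on arriving in state s2 (illustrative choice)
    horizon     : number of decisions T remaining at time 0

    Returns (V, policy), where V[t][s] is the optimal expected total reward
    from state s at time t, and policy[t][s] is an optimal action there.
    """
    n_actions = len(P)
    n_states = len(R)
    # V[horizon] is all zeros: no reward can be collected after the last step.
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            # Q-value of each action: expected reward plus value-to-go.
            q = [
                sum(P[a][s][s2] * (R[s2] + V[t + 1][s2]) for s2 in range(n_states))
                for a in range(n_actions)
            ]
            # Ties are broken in favor of the lowest action index.
            best = max(range(n_actions), key=lambda a: q[a])
            policy[t][s] = best
            V[t][s] = q[best]
    return V, policy


# Hypothetical 3-state chain: action 0 = stay, action 1 = move right.
# State 2 is absorbing, and only arriving in state 2 is rewarded.
P = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # stay
    [[0, 1, 0], [0, 0, 1], [0, 0, 1]],  # move right
]
R = [0.0, 0.0, 1.0]

V, policy = backward_induction(P, R, horizon=2)
print(V[0])       # [1.0, 2.0, 2.0]
print(policy[0])  # [1, 1, 0]
```

In this toy example, a planner that looks only one step ahead from state 0 sees no difference between staying and moving right, since neither reaches the rewarded state immediately. With a horizon of 2, backward induction identifies moving right as the optimal first action. This is only meant to illustrate why optimality at horizon 1 and optimality at longer horizons are distinct requirements.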
