Value and Policy Iterations in Optimal Control and Adaptive Dynamic Programming

Dimitri P Bertsekas

doi:10.1109/TNNLS.2015.2503980

IEEE Trans Neural Netw Learn Syst. 2017 Mar;28(3):500-509. doi: 10.1109/TNNLS.2015.2503980. Epub 2015 Dec 22.

PMID: 28055911

Abstract

In this paper, we consider discrete-time infinite-horizon problems of optimal control to a terminal set of states. Such problems are often taken as the starting point for adaptive dynamic programming. Under very general assumptions, we establish the uniqueness of the solution of Bellman's equation, and we provide convergence results for value and policy iterations.
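
To make the two iterations named in the abstract concrete, here is a minimal sketch on a hypothetical toy problem. It is not taken from the paper, which treats far more general settings. The assumptions are: a finite state space, deterministic transitions, nonnegative stage costs, and a single cost-free absorbing terminal state. The states, controls, and costs in `TRANSITIONS` are made up for illustration. Value iteration repeatedly applies the Bellman update J(x) <- min over u of [g(x, u) + J(f(x, u))], starting from J = 0. Policy iteration alternates exact evaluation of a policy with a greedy improvement step.

```python
# Hypothetical toy problem (all numbers invented for illustration).
# State 0 is the terminal state: cost-free and absorbing.
# TRANSITIONS[x][u] = (next_state, stage_cost) for non-terminal state x.
TERMINAL = 0
TRANSITIONS = {
    1: {"a": (0, 4.0), "b": (2, 1.0)},
    2: {"a": (0, 2.0), "b": (3, 1.0)},
    3: {"a": (1, 1.0), "b": (0, 5.0)},
}


def bellman_update(J):
    """One synchronous Bellman update; the terminal cost stays at zero."""
    new_J = {TERMINAL: 0.0}
    for x, controls in TRANSITIONS.items():
        new_J[x] = min(cost + J[nxt] for nxt, cost in controls.values())
    return new_J


def value_iteration(tol=1e-12, max_iters=1000):
    """Iterate the Bellman update from J = 0 until successive iterates agree."""
    J = {x: 0.0 for x in [TERMINAL, *TRANSITIONS]}
    for _ in range(max_iters):
        new_J = bellman_update(J)
        if max(abs(new_J[x] - J[x]) for x in J) <= tol:
            return new_J
        J = new_J
    raise RuntimeError("value iteration did not converge")


def evaluate_policy(policy):
    """Cost of a deterministic policy, found by following it to the terminal state."""
    J = {TERMINAL: 0.0}
    for start in TRANSITIONS:
        x, total, steps = start, 0.0, 0
        while x != TERMINAL:
            # More steps than non-terminal states means the path has a cycle.
            if steps >= len(TRANSITIONS):
                raise ValueError("policy never reaches the terminal state")
            nxt, cost = TRANSITIONS[x][policy[x]]
            total += cost
            x = nxt
            steps += 1
        J[start] = total
    return J


def policy_iteration(policy):
    """Alternate exact policy evaluation and greedy policy improvement."""
    while True:
        J = evaluate_policy(policy)
        improved = {
            x: min(controls, key=lambda u: controls[u][1] + J[controls[u][0]])
            for x, controls in TRANSITIONS.items()
        }
        if improved == policy:
            return policy, J
        policy = improved
```

Calling `value_iteration()` and `policy_iteration({1: "a", 2: "a", 3: "a"})` on this toy problem yields the same cost function, which in this small example is the unique solution of Bellman's equation with zero cost at the terminal state. The initial policy passed to `policy_iteration` must reach the terminal state from every start, otherwise `evaluate_policy` raises an error.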