A boundedness result for the direct heuristic dynamic programming

Feng Liu; Jian Sun; Jennie Si; Wentao Guo; Shengwei Mei

doi:10.1016/j.neunet.2012.02.005

Neural Netw. 2012 Aug;32:229-35. doi: 10.1016/j.neunet.2012.02.005. Epub 2012 Feb 14.

Authors

Feng Liu¹, Jian Sun, Jennie Si, Wentao Guo, Shengwei Mei

Affiliation

¹ Department of Electrical Engineering, Tsinghua University, Beijing, 100084, PR China.

PMID: 22397949
DOI: 10.1016/j.neunet.2012.02.005

Abstract

Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential to scale to problems with large state and control spaces, including those involving continuous states and continuous controls. The applicability of ADP algorithms, especially the adaptive critic designs, has been demonstrated in several case studies. Direct heuristic dynamic programming (direct HDP) is one of the ADP algorithms inspired by the adaptive critic designs. It has been shown to be applicable to industrial-scale, realistic, and complex control problems. In this paper, we provide a uniform ultimate boundedness (UUB) result for the direct HDP learning controller under mild and intuitive conditions. Using a Lyapunov approach, we show that the estimation errors of the learning parameters, that is, the weights in the action and critic networks, remain UUB. This result provides, for the first time, a useful controller convergence guarantee for the direct HDP design.
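
For readers unfamiliar with the term, the following is the standard textbook notion of uniform ultimate boundedness, written here for a discrete-time signal. It is a sketch for orientation only: the symbols x(k), k_0, a, b, c, and T are assumptions introduced for illustration and are not taken from the abstract, and the paper's exact conditions may differ. In the setting described above, x(k) would stand for the weight estimation errors of the action and critic networks.

```latex
% Standard definition of uniform ultimate boundedness (UUB), discrete time.
% All symbols are illustrative assumptions, not notation from the paper.
\exists\, b > 0,\ c > 0 \ \text{such that}\ \forall\, a \in (0, c)\ \
\exists\, T = T(a, b) \ge 0 :
\qquad
\|x(k_0)\| \le a \;\Longrightarrow\; \|x(k)\| \le b
\quad \forall\, k \ge k_0 + T,
% where b, c, and T do not depend on the initial time k_0.
```

Informally, every trajectory that starts within a ball of radius a enters a ball of radius b after at most T steps and stays there, which is weaker than convergence of the errors to zero.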

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms
Artificial Intelligence*
Neural Networks, Computer
Neurons / physiology
Online Systems
Programming, Linear*