Approximation of smooth functionals using deep ReLU networks

Linhao Song; Ying Liu; Jun Fan; Ding-Xuan Zhou

doi:10.1016/j.neunet.2023.07.012

Neural Netw. 2023 Sep:166:424-436. doi: 10.1016/j.neunet.2023.07.012. Epub 2023 Jul 18.

Authors

Linhao Song¹, Ying Liu², Jun Fan³, Ding-Xuan Zhou⁴

Affiliations

¹ School of Mathematical Science, Beihang University, Beijing, China; School of Data Science, City University of Hong Kong, Kowloon, Hong Kong. Electronic address: linhasong2-c@my.cityu.edu.hk.
² Laboratory for AI-Powered Financial Technologies, Hong Kong Science Park, Shatin, New Territories, Hong Kong. Electronic address: yingliu@hkaift.com.
³ Department of Mathematics, Hong Kong Baptist University, Kowloon, Hong Kong. Electronic address: junfan@hkbu.edu.hk.
⁴ School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia. Electronic address: dingxuan.zhou@sydney.edu.au.

PMID: 37549610
DOI: 10.1016/j.neunet.2023.07.012

Abstract

In recent years, deep neural networks have been employed to approximate nonlinear continuous functionals F defined on L^p([-1,1]^s) for 1 ≤ p ≤ ∞. However, the existing theoretical analyses in the literature either yield poor approximation rates or do not apply to the rectified linear unit (ReLU) activation function. This paper investigates the approximation power of functional deep ReLU networks in two settings: F is continuous with restrictions on the modulus of continuity, and F has higher order Fréchet derivatives. A novel functional network structure is proposed to extract features of higher order smoothness harbored by the target functional F. Quantitative rates of approximation in terms of the depth, width and total number of weights of neural networks are derived for both settings. We give logarithmic rates when measuring the approximation error on the unit ball of a Hölder space. In addition, we establish nearly polynomial rates (i.e., rates of the form exp(-a(log M)^b) with a > 0 and 0 < b < 1) when measuring the approximation error on a space of analytic functions.
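To give a feel for where a rate of the form exp(-a(log M)^b) sits between a logarithmic rate and a polynomial rate, the following minimal Python sketch evaluates the three shapes side by side. It is an illustration only: the constants a, b and c, the sample values of M, and the function names are assumptions chosen for the example, not values or notation taken from the paper, and M is treated here simply as a generic size parameter.

```python
import math


def logarithmic_rate(M, c=1.0):
    """Logarithmic shape (log M)^(-c); c is an illustrative constant."""
    return math.log(M) ** (-c)


def nearly_polynomial_rate(M, a=1.0, b=0.5):
    """Shape exp(-a (log M)^b) with a > 0 and 0 < b < 1 (illustrative a, b)."""
    return math.exp(-a * math.log(M) ** b)


def polynomial_rate(M, a=1.0):
    """Polynomial shape M^(-a) = exp(-a log M), i.e. the b = 1 limiting case."""
    return float(M) ** (-a)


# Hypothetical sample sizes, chosen only to show the ordering of the three shapes.
SAMPLE_SIZES = (10**3, 10**6, 10**12, 10**24)

for M in SAMPLE_SIZES:
    print(
        f"M = {M:.0e}: "
        f"logarithmic = {logarithmic_rate(M):.3e}, "
        f"nearly polynomial = {nearly_polynomial_rate(M):.3e}, "
        f"polynomial = {polynomial_rate(M):.3e}"
    )
```

With these illustrative constants the nearly polynomial shape decays faster than the logarithmic one and slower than the polynomial one at every sampled M, which is the sense in which such rates are "nearly" polynomial: since 0 < b < 1, the exponent (log M)^b grows more slowly than log M but faster than log log M.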

Keywords: Approximation theory; Deep learning theory; Fréchet derivative; Polynomial rates; ReLU; Smooth functionals.

MeSH terms

Algorithms*
Neural Networks, Computer*