Compositionally-warped Gaussian processes

Gonzalo Rios; Felipe Tobar

doi:10.1016/j.neunet.2019.06.012

Neural Netw. 2019 Oct:118:235-246. doi: 10.1016/j.neunet.2019.06.012. Epub 2019 Jul 4.

Authors

Gonzalo Rios¹, Felipe Tobar²

Affiliations

¹ Department of Mathematical Engineering, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile. Electronic address: grios@dim.uchile.cl.
² Department of Mathematical Engineering, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile; Center for Mathematical Modeling, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile. Electronic address: ftobar@dim.uchile.cl.

PMID: 31319321

Abstract

The Gaussian process (GP) is a nonparametric prior distribution over functions indexed by time, space, or another high-dimensional index set. The GP is a flexible model, yet its main limitation follows from its very nature: it can only model Gaussian marginal distributions. To model non-Gaussian data, a GP can be warped by a nonlinear transformation (or warping), as done by warped GPs (WGPs) and by more computationally demanding alternatives such as Bayesian WGPs and deep GPs. However, the WGP requires a numerical approximation of the inverse warping for prediction, which increases the computational complexity in practice. To sidestep this issue, we construct a novel class of warpings consisting of compositions of multiple elementary functions, for which the inverse is known explicitly. We then propose the compositionally-warped GP (CWGP), a non-Gaussian generative model whose expressiveness follows from its deep compositional architecture and whose computational efficiency is guaranteed by the analytical inverse warping. Experimental validation using synthetic and real-world datasets confirms that the proposed CWGP is robust to the choice of warpings and provides more accurate point predictions, better-trained models, and shorter computation times than the WGP.
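
The central idea, that a composition of elementary invertible functions has an inverse available in closed form, can be illustrated with a minimal sketch. The elementary functions chosen here (affine, arcsinh, Box-Cox), their parameter values, and all names (`Warp`, `affine`, `arcsinh`, `box_cox`, `compose`) are assumptions made for illustration only and are not taken from the abstract; the sketch also leaves out the GP itself and any parameter training.

```python
import math
from typing import Callable, NamedTuple, Sequence


class Warp(NamedTuple):
    """Hypothetical container pairing a warping with its closed-form inverse."""
    forward: Callable[[float], float]
    inverse: Callable[[float], float]


def affine(a: float, b: float) -> Warp:
    # Assumes a != 0 so that the map is invertible.
    return Warp(lambda y: a * y + b, lambda z: (z - b) / a)


def arcsinh() -> Warp:
    # arcsinh is a bijection on the real line; its inverse is sinh.
    return Warp(math.asinh, math.sinh)


def box_cox(lam: float) -> Warp:
    # Assumes lam != 0 and positive inputs y > 0.
    return Warp(
        lambda y: (y ** lam - 1.0) / lam,
        lambda z: (lam * z + 1.0) ** (1.0 / lam),
    )


def compose(layers: Sequence[Warp]) -> Warp:
    """Compose layers left to right; invert by applying inverses in reverse."""
    def forward(y: float) -> float:
        for layer in layers:
            y = layer.forward(y)
        return y

    def inverse(z: float) -> float:
        for layer in reversed(layers):
            z = layer.inverse(z)
        return z

    return Warp(forward, inverse)


# Illustrative three-layer warping with arbitrary parameter values.
warp = compose([box_cox(0.5), affine(2.0, -1.0), arcsinh()])

ys = [0.5, 1.0, 2.0, 5.0]                    # positive "observations"
zs = [warp.forward(y) for y in ys]           # warped values
recovered = [warp.inverse(z) for z in zs]    # closed-form inverse, no root finding
```

Because every layer carries its own explicit inverse, inverting the whole composition is a single pass through the layers in reverse order, with no iterative numerical solver.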

Keywords: Function compositions; Gaussian process; Neural networks; Non-Gaussian models; Warped Gaussian processes.

MeSH terms

Bayes Theorem
Neural Networks, Computer*
Normal Distribution