Splitting Gaussian processes for computationally-efficient regression

PLoS One. 2021 Aug 24;16(8):e0256470. doi: 10.1371/journal.pone.0256470. eCollection 2021.

Abstract

Gaussian processes offer a flexible kernel method for regression. While Gaussian processes have many useful theoretical properties and have proven practically useful, they suffer from poor scaling in the number of observations. In particular, the cubic time complexity of updating standard Gaussian process models can be a limiting factor in applications. We propose an algorithm for sequentially partitioning the input space and fitting a localized Gaussian process to each disjoint region. The algorithm is shown to have superior time and space complexity to existing methods, and its sequential nature allows the model to be updated efficiently. The algorithm constructs a model for which the time complexity of updating is tightly bounded above by a pre-specified parameter. To the best of our knowledge, the model is the first local Gaussian process regression model to achieve linear memory complexity. Theoretical continuity properties of the model are proven. We demonstrate the efficacy of the resulting model on several multi-dimensional regression tasks.
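The abstract's core idea — replacing one global GP (cubic cost in all n observations) with independent local GPs on disjoint regions of the input space — can be illustrated with a minimal sketch. This is not the authors' algorithm (which partitions sequentially and bounds each region's size by a pre-specified parameter); it is a simplified single-split example assuming an RBF kernel and a median split on the first input coordinate, to show why per-region fitting reduces the cubic cost:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel matrix between row sets A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * sq / lengthscale**2)

class LocalGP:
    """Exact GP regressor for one region: O(m^3) in its own m points only."""
    def __init__(self, X, y, noise=1e-2):
        self.X = X
        K = rbf_kernel(X, X) + noise * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, Xs):
        # Posterior mean: k(Xs, X) @ (K + noise*I)^{-1} y
        return rbf_kernel(Xs, self.X) @ self.alpha

class SplitGP:
    """Illustrative partitioned GP: one split at the median of the first
    coordinate, with an independent LocalGP per disjoint region.
    (Hypothetical simplification, not the paper's sequential algorithm.)"""
    def __init__(self, X, y):
        self.t = np.median(X[:, 0])
        left = X[:, 0] <= self.t
        self.models = (LocalGP(X[left], y[left]), LocalGP(X[~left], y[~left]))

    def predict(self, Xs):
        # Route each query point to the local model of its region.
        out = np.empty(len(Xs))
        left = Xs[:, 0] <= self.t
        out[left] = self.models[0].predict(Xs[left])
        out[~left] = self.models[1].predict(Xs[~left])
        return out

# Tiny 1-D demo: noisy sine data
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(200)
model = SplitGP(X, y)
Xs = np.linspace(-3, 3, 50)[:, None]
pred = model.predict(Xs)
```

With two regions of size n/2, each Cholesky factorization costs O((n/2)^3), a factor-of-four saving over the global O(n^3); the paper's scheme extends this to many regions with a hard cap on region size, which is what yields its bounded update cost and linear memory. Note that naive independent local models can be discontinuous at region boundaries, which is why the paper proves continuity properties for its construction.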

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Normal Distribution

Grants and funding

Y.C. was awarded two grants from the National Science Foundation (NSF) for this project. See https://www.nsf.gov/. Specifically, funds from grants CMMI-1824681 and DMS-1952781 supported this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.