Spatial Factor Models for High-Dimensional and Large Spatial Data: An Application in Forest Variable Mapping

Stat Sin. 2019:29:1155-1180. doi: 10.5705/ss.202018.0005.

Abstract

Gathering information about forest variables is an expensive and arduous activity. As such, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next generation collection initiatives of remotely sensed Light Detection and Ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to make use of the former to model, predict, and map forest variables over regions of interest. This entails dealing with the high-dimensional (~102) spatially dependent LiDAR outcomes over a large number of locations (~105-106). With this in mind, we develop the Spatial Factor Nearest Neighbor Gaussian Process (SF-NNGP) model, and embed it in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of forest variables with associated uncertainty over a large region of boreal forests in interior Alaska.

Keywords: LiDAR data; forest outcomes; nearest neighbor Gaussian processes; spatial prediction.