Evaluating gross primary productivity over 9 ChinaFlux sites based on random forest regression models, remote sensing, and eddy covariance data

Sci Total Environ. 2023 Jun 1:875:162601. doi: 10.1016/j.scitotenv.2023.162601. Epub 2023 Mar 5.

Abstract

Accurate modeling of gross primary productivity (GPP) in terrestrial ecosystems is a major challenge in quantifying the carbon cycle. Many light use efficiency (LUE) models have been developed, but the variables and algorithms used for environmental constraints vary considerably among models. It is still unclear whether the models can be further improved by machine learning methods and by combining different variables. Here, we developed a series of RFR-LUE models, which apply the random forest regression (RFR) algorithm to the variables of LUE models, to explore their potential for estimating site-level GPP. Based on remote sensing indices, eddy covariance and meteorological data, we applied RFR-LUE models to evaluate the effects of different variable combinations on GPP at daily, 8-day, 16-day and monthly scales. Cross-validation analyses revealed that the performance of RFR-LUE models varied significantly among sites, with R² of 0.52-0.97. Slopes of the regression relationship between simulated and observed GPP ranged from 0.59 to 0.95. Most models performed better in capturing the temporal changes and magnitude of GPP in mixed forests and evergreen needle-leaf forests than in evergreen broadleaf forests and grasslands. Performance improved at longer temporal scales, with average R² for the four temporal resolutions of 0.81, 0.87, 0.88, and 0.90, respectively. Additionally, the variable importance analysis showed that temperature and vegetation indices were the critical variables for RFR-LUE models, followed by radiation and moisture variables. The importance of moisture variables was higher in non-forests than in forests. A comparison with four GPP products indicated that GPP predicted by the RFR-LUE models better matched observed GPP across sites. The study provides an approach to deriving GPP fluxes and evaluating the extent to which variables affect GPP estimation. It may be used for predicting vegetation GPP at regional scales and for calibration and evaluation of land surface process models.
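
The two agreement statistics quoted above (R² and the slope of the regression between simulated and observed GPP) can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: simulated GPP is regressed on observed GPP by ordinary least squares, and R² is taken as the squared Pearson correlation. The function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean


def r2_and_slope(observed, simulated):
    """Return (R2, slope) for an OLS fit of simulated on observed GPP.

    Assumption: R2 is the squared Pearson correlation and the slope is
    taken from regressing simulated on observed values; the abstract
    does not specify either choice.
    """
    mean_obs, mean_sim = mean(observed), mean(simulated)
    # Sums of squares and cross-products about the means.
    sxx = sum((o - mean_obs) ** 2 for o in observed)
    syy = sum((s - mean_sim) ** 2 for s in simulated)
    sxy = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)
    return r2, slope


# Hypothetical GPP values (e.g. g C m-2 d-1), for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.1, 3.7, 5.3, 6.9]
r2, slope = r2_and_slope(observed, simulated)
print(f"R2 = {r2:.2f}, slope = {slope:.2f}")
```

Under these assumptions, a slope below 1 together with a high R² would indicate a model that tracks temporal variation well while damping the magnitude of observed GPP.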

Keywords: Gross primary production; Light use efficiency (LUE) model; Random forest regression (RFR) model; Vegetation indices; Water indices.