When does stratification of a subtropical soil spectral library improve predictions of soil organic carbon content?

Sci Total Environ. 2020 Oct 1:737:139895. doi: 10.1016/j.scitotenv.2020.139895. Epub 2020 Jun 5.

Abstract

Developing more accurate models for the prediction of soil organic carbon (SOC) by visible-near-infrared (Vis-NIR) spectroscopy remains a challenging task, especially when the soil spectral library (SSL) is composed of soils with high pedological variation. One proposition to increase model accuracy is to reduce the SSL variance, which can be achieved by stratifying the library into sub-libraries. Thus, the main objective of this study was to evaluate whether the stratification of an SSL by environmental, pedological and Vis-NIR spectral criteria results in greater accuracy of spectroscopic models than general models for the prediction of SOC content. The performance of the models was evaluated considering the variance of soil components and the number of samples. In addition, we tested the effect of two spectral preprocessing techniques and two multivariate calibration methods on spectroscopic modeling. For these purposes, an SSL composed of 2471 samples from Southern Brazil was stratified based on i) physiographic region; ii) land-use/land-cover; iii) soil texture; and iv) spectral class. Two spectral preprocessing techniques (Savitzky-Golay first derivative, SGD, and continuum-removed reflectance, CRR) and two multivariate methods (partial least squares regression, PLSR, and Cubist) were used to fit the models. The best performances for the global and local models were achieved with CRR preprocessing combined with the Cubist method. Stratifying the SSL into more homogeneous sample groups by environmental criteria (physiographic regions and land-use/land-cover) improved the accuracy of SOC predictions compared with stratification by pedological (soil texture) and Vis-NIR spectral (spectral classes) criteria. The reduction in the number of samples negatively affected the performance of models for sub-libraries with high pedological and spectral variation. Stratification criteria were proposed in a flowchart to assist decision making in future studies. Our findings suggest the importance of sample balance across environmental, pedological and spectral strata in order to optimize SOC predictions.
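
The idea of judging each sub-library by its sample number and SOC variance can be illustrated with a minimal sketch. It is not the authors' workflow: the records are synthetic, the stratum labels (`region_A`, etc.) are made up, and the `min_samples` threshold is a hypothetical cut-off chosen only for illustration. It uses the Python standard library alone and does no spectral modeling.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Synthetic (stratum label, SOC in %) records; NOT data from the study.
samples = [
    ("region_A", 1.2), ("region_A", 1.4), ("region_A", 1.1), ("region_A", 1.3),
    ("region_B", 0.5), ("region_B", 3.9), ("region_B", 2.2),
    ("region_C", 2.0), ("region_C", 2.1),
]


def summarize_strata(records, min_samples=3):
    """Group records by stratum and report size, mean and variance of SOC.

    min_samples is a hypothetical threshold used only to flag small strata.
    """
    groups = defaultdict(list)
    for stratum, soc in records:
        groups[stratum].append(soc)

    summary = {}
    for stratum, values in groups.items():
        summary[stratum] = {
            "n": len(values),
            "mean": mean(values),
            "variance": pvariance(values),  # population variance within the stratum
            "enough_samples": len(values) >= min_samples,
        }
    return summary


# Variance of SOC over the whole (unstratified) library, for comparison.
global_variance = pvariance([soc for _, soc in samples])
summary = summarize_strata(samples)

for stratum, stats in sorted(summary.items()):
    print(stratum, stats["n"], round(stats["variance"], 3), stats["enough_samples"])
```

In this toy data, one stratum is far more homogeneous than the full library, one is more variable than the full library, and one falls below the sample threshold. These are the situations in which stratification would be expected to help, not help, or leave too few samples to calibrate on.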

Keywords: Environmental-based learning; SOC variance; Spectral models; Spectral variation; Vis-NIR data mining.