Potential of visible and near infrared spectroscopy coupled with machine learning for predicting soil metal concentrations at the regional scale

Sci Total Environ. 2022 Oct 1:841:156582. doi: 10.1016/j.scitotenv.2022.156582. Epub 2022 Jun 14.

Abstract

Chemical analytical methods for metal analysis in soils are laborious, time-consuming and costly. This paper aims to evaluate the potential of short-range (SR) and full-range (FR) visible and near-infrared (vis-NIR) spectroscopy combined with linear and nonlinear calibration methods to estimate concentrations of nickel (Ni), cobalt (Co), cadmium (Cd), lead (Pb) and copper (Cu) in soils. A total of 435 soil samples were collected from agricultural sites, forest (7 %), pasture (5 %) and fallow land across a region in the northern part of Belgium. Generally, better predictions were obtained when the spectral data were processed with partial least squares regression (PLSR) or a nonlinear calibration method [i.e., random forest (RF)] than with support vector machine (SVM). FR generally outperformed SR and provided the best prediction results for Ni (R2p = 0.76), Co (R2p = 0.77), Cd (R2p = 0.64) and Pb (R2p = 0.65) when using PLSR and RF. SVM produced the best prediction result only for Pb (R2p = 0.57) using the SR spectra. The metals Ni, Co, Cd and Pb can be predicted successfully (good accuracy) from the FR vis-NIR spectra using PLSR for Co, and RF for Ni, Cd, Pb and Cu. Compared to the FR spectrophotometer, an improvement in accuracy was obtained for Cd and Co using the SR spectra combined with PLSR and RF, respectively. It is concluded that the SR spectrometer can be used successfully to predict Co with RF (R2p = 0.70) and that it best predicted Cd with PLSR (R2p = 0.67), which is of value for regional surveys.
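The abstract reports model quality as R2p without defining it. A minimal sketch follows, assuming R2p is the coefficient of determination computed on a held-out prediction set (a common chemometrics convention, not confirmed by the abstract). The function name `r2_prediction` and the sample values are hypothetical and are not data from the study.

```python
from statistics import fmean


def r2_prediction(measured, predicted):
    """Coefficient of determination on a held-out prediction set.

    Assumption: this is what the abstract's "R2p" denotes.
    measured  -- reference concentrations from chemical analysis
    predicted -- concentrations estimated by a calibration model
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if len(measured) < 2:
        raise ValueError("at least two samples are required")

    mean_measured = fmean(measured)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical metal concentrations (illustrative only, not study data).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2p = r2_prediction(measured, predicted)
```

Under this definition, a value of 1 means the predictions match the measurements exactly, and a value of 0 means the model does no better than always predicting the mean of the measured values.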

Keywords: Chemometrics; Machine learning modelling; Metals; Near-infrared spectra; Soil contamination.

MeSH terms

  • Cadmium / analysis
  • Lead / analysis
  • Nickel / analysis
  • Soil Pollutants* / analysis
  • Soil* / chemistry
  • Spectroscopy, Near-Infrared
  • Support Vector Machine

Substances

  • Soil
  • Soil Pollutants
  • Cadmium
  • Lead
  • Nickel