Estimation of the soil arsenic concentration using a geographically weighted XGBoost model based on hyperspectral data

Sci Total Environ. 2023 Feb 1;858(Pt 1):159798. doi: 10.1016/j.scitotenv.2022.159798. Epub 2022 Oct 26.

Abstract

Arsenic (As) is highly toxic, so its contamination of soil is a serious environmental and public health issue. Existing models that estimate soil heavy metal concentrations from hyperspectral data ignore the spatial nonstationarity of the relationship between the soil spectrum and heavy metal concentration. A novel model, the geographically weighted eXtreme gradient boosting (GW-XGBoost) model, which combines the geographically weighted regression (GWR) method with the XGBoost algorithm, was proposed. The northeast district of Beijing, China, was chosen as a case study area to assess the effectiveness of the proposed model. The GW-XGBoost model was established to estimate the As concentration from the typical spectrum of As and the spatial correlation between the spectrum and As concentration obtained using the GWR method, and the result was compared with those obtained with the XGBoost and GWR models. The GW-XGBoost model was markedly more accurate than the other models (R² = 0.90 for GW-XGBoost, 0.48 for XGBoost, and 0.74 for GWR). The proposed model is therefore reliable because it accounts for the spatial correlation between the spectrum and As concentration.
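
The spatial nonstationarity that GWR is meant to capture can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, the bandwidth, the spectral bands used, or how the GWR output is passed to XGBoost. The Gaussian kernel, the single hypothetical band, the synthetic grid, and all function names below are assumptions chosen for illustration. The sketch fits a locally weighted regression of concentration on one band at two locations and shows that the local slope differs between them; it also defines the R² statistic used to compare models.

```python
import math


def gaussian_weights(target, coords, bandwidth):
    """Gaussian kernel weight of each sample relative to one target location (assumed kernel)."""
    tx, ty = target
    return [
        math.exp(-0.5 * ((x - tx) ** 2 + (y - ty) ** 2) / bandwidth ** 2)
        for x, y in coords
    ]


def local_fit(target, coords, band, conc, bandwidth):
    """Weighted least squares of conc on a single band, centred on target.

    Returns the local (intercept, slope) pair.
    """
    w = gaussian_weights(target, coords, bandwidth)
    sw = sum(w)
    mean_b = sum(wi * bi for wi, bi in zip(w, band)) / sw
    mean_c = sum(wi * ci for wi, ci in zip(w, conc)) / sw
    sxx = sum(wi * (bi - mean_b) ** 2 for wi, bi in zip(w, band))
    sxy = sum(
        wi * (bi - mean_b) * (ci - mean_c) for wi, bi, ci in zip(w, band, conc)
    )
    slope = sxy / sxx
    return mean_c - slope * mean_b, slope


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic data (hypothetical): a 10 x 5 grid of sampling sites where the
# band-to-concentration slope grows from west (i = 0) to east (i = 9).
coords = [(i, j) for i in range(10) for j in range(5)]
band = [0.1 + 0.05 * ((3 * i + 7 * j) % 11) for i, j in coords]
conc = [(1.0 + 0.5 * i) * b for (i, j), b in zip(coords, band)]

_, west_slope = local_fit((0, 2), coords, band, conc, bandwidth=1.5)
_, east_slope = local_fit((9, 2), coords, band, conc, bandwidth=1.5)
print(f"local slope, west: {west_slope:.2f}; east: {east_slope:.2f}")
```

A single global regression would return one slope for the whole area. Local coefficients such as these are one plausible way to supply spatial information to a gradient boosting model as additional features, although the abstract does not confirm that this is how GW-XGBoost is constructed.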

Keywords: Geographically weighted XGBoost model; Hyperspectral data; Soil arsenic concentration; Spatial nonstationarity.

MeSH terms

  • Arsenic*
  • China
  • Environmental Monitoring / methods
  • Metals, Heavy*
  • Soil
  • Spatial Regression

Substances

  • Soil
  • Arsenic
  • Metals, Heavy