How to describe organic contamination in soils: A model comparison for chlorinated solvent concentrations at industrial site scale

Sci Total Environ. 2018 Aug 15;633:1480-1495. doi: 10.1016/j.scitotenv.2018.03.257. Epub 2018 Apr 3.

Abstract

The heavy-tailed distribution of organic pollution data in soils can raise specific problems in estimating and mapping concentrations. A few high values often strongly affect the sample variogram and extend the pollution hot-spots on the estimation maps. Non-linear geostatistical models, such as the anamorphosed Gaussian model, were proposed in the 1970s. They allow a consistent estimate of the concentrations and of the probability that the concentrations exceed a cut-off. These well-founded methods are rarely used by environmental consultants, mainly because of time constraints and because the hypotheses of the models are not always satisfied. To estimate the concentrations, an empirical method widely used by environmental consultants consists of truncating the high values to gain robustness in the variogram analysis. The truncation value is chosen arbitrarily, even though it strongly influences the concentration estimates. Originally proposed to handle heavy-tailed distributions of ore grades, the top-cut model (Rivoirard et al., 2013) justifies the use of truncated values but corrects the underestimation of the mean caused by truncation. In this model, the decomposition of the variable into three components (the truncated value, a weighted indicator at the top-cut threshold and a residual) makes the variographic study more robust and guides the choice of the top-cut threshold. For a case of chlorinated solvent contamination, several estimation methods are compared in detail: ordinary kriging, kriging after truncation of the highest concentrations, and estimation within the top-cut model, with a structured or pure nugget residual. A sensitivity study with respect to the top-cut threshold is carried out. The results of two implementations of cross-validation are compared. The top-cut model with nugget residual appears to be robust, even if the hypotheses of the model are not perfectly satisfied.
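
The three-component decomposition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the indicator is weighted; the sketch assumes the weight is the mean excess of the values above the top-cut threshold, so that the residual averages to zero. The function name `top_cut_decompose`, the synthetic lognormal data and the 95% quantile threshold are all hypothetical choices made for illustration, not the study's data or settings.

```python
import numpy as np

def top_cut_decompose(z, threshold):
    """Split z into truncated value, weighted indicator and residual.

    Assumption (not stated in the abstract): the indicator weight is the
    mean excess above the threshold, so the residual has zero mean.
    """
    z = np.asarray(z, dtype=float)
    truncated = np.minimum(z, threshold)           # values capped at the top-cut
    indicator = (z > threshold).astype(float)      # 1 where the threshold is exceeded
    if indicator.sum() == 0:
        raise ValueError("no value exceeds the threshold")
    excess = z - truncated                         # zero below the threshold
    mean_excess = excess[indicator == 1].mean()    # assumed indicator weight
    residual = excess - mean_excess * indicator    # what the indicator does not explain
    return truncated, indicator, mean_excess, residual

# Hypothetical heavy-tailed sample standing in for soil concentrations.
rng = np.random.default_rng(0)
conc = rng.lognormal(mean=0.0, sigma=1.5, size=500)
threshold = np.quantile(conc, 0.95)

truncated, indicator, mean_excess, residual = top_cut_decompose(conc, threshold)

# The three components add back to the original variable.
reconstructed = truncated + mean_excess * indicator + residual

# Truncation alone underestimates the mean; adding the weighted indicator
# restores it, because the residual averages to zero.
mean_truncated = truncated.mean()
mean_corrected = (truncated + mean_excess * indicator).mean()
```

In this sketch, `mean_truncated` is below the sample mean while `mean_corrected` matches it, which is the correction of the truncation bias that the abstract attributes to the top-cut model. In an actual study, each component would be kriged from its own variogram; that step is not shown here.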

Keywords: Geostatistics; Heavy-tailed distribution; Kriging; Soil contamination modelling.