Prediction of effluent arsenic concentration of wastewater treatment plants using machine learning and kriging-based models

Environ Sci Pollut Res Int. 2022 Mar;29(14):20556-20570. doi: 10.1007/s11356-021-16916-6. Epub 2021 Nov 5.

Abstract

This study evaluates the potential of kriging-based models (kriging and kriging-logistic) and machine learning models (multivariate adaptive regression splines, MARS; gradient boosted regression trees, GBRT; and artificial neural networks, ANN) for predicting the effluent arsenic concentration of a wastewater treatment plant. Two input scenarios were established from seven quantitative and qualitative independent influent variables. In the first scenario, all seven variables were used to construct the data-driven models. In the second, forward selection with k-fold cross-validation was used to select the most effective explanatory influent parameters. The results from both scenarios show that the kriging-logistic and machine learning models are effective and robust. Feature selection in the second scenario not only made the model architecture simpler and more effective but also improved the performance of the developed models (e.g., an RMSE improvement of around 7.8%). The standard kriging method gave the weakest predictive results (RMSE = 0.18 µg/L and NSE = 0.75), whereas the kriging-logistic method gave the best performance among the applied models (RMSE = 0.11 µg/L and NSE = 0.90).
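
The two scores reported above can be illustrated with a minimal sketch. It assumes that RMSE is the root mean square error and that NSE is the Nash-Sutcliffe efficiency in its usual form, 1 - SSE / sum((obs - mean(obs))^2); the abstract does not spell out either definition. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. ug/L)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sse / len(observed))


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency (assumed definition): 1 is a perfect fit,
    0 means the model is no better than predicting the observed mean."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - sse / sst


# Made-up effluent concentrations, for illustration only (not study data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")  # RMSE = 0.50
print(f"NSE  = {nse(observed, predicted):.2f}")   # NSE  = 0.80
```

Under these definitions a lower RMSE and an NSE closer to 1 both indicate a better fit, which is consistent with the abstract ranking kriging-logistic (0.11, 0.90) above standard kriging (0.18, 0.75).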

Keywords: Arsenic mitigation; Artificial neural networks; Gradient boosted regression trees; Kriging-logistic; Multivariate adaptive regression splines.

MeSH terms

  • Arsenic*
  • Machine Learning
  • Spatial Analysis
  • Water Purification*

Substances

  • Arsenic