Estimation of sodium adsorption ratio indicator using data mining methods: a case study in Urmia Lake basin, Iran

Environ Sci Pollut Res Int. 2018 Feb;25(5):4776-4786. doi: 10.1007/s11356-017-0844-y. Epub 2017 Dec 2.

Abstract

Water quality is a major concern worldwide, particularly in dry climates. Assessing surface water quality is usually costly and time-consuming, so a method that estimates water quality accurately from a minimal set of hydro-chemical parameters is appealing. In this study, three data mining methods, namely the M5 model tree, support vector machine (SVM), and Gaussian process (GP), were employed to estimate the sodium adsorption ratio (SAR) indicator in the Shahrchay River, located in the west of the Urmia Lake basin, Iran. Results from these methods were compared with those of an artificial neural network (ANN). Different hydro-chemical parameters were assessed, and the most effective ones were selected. Five combinations of the selected parameters were developed as model inputs. The results indicated that the M5 model tree performed best among the data mining methods when the combination of sodium and electrical conductivity (Na and EC) was used as input, with a coefficient of determination (R²) = 0.987, root mean squared error (RMSE) = 0.017, mean absolute error (MAE) = 0.012, and mean relative error (MRE) = 5.584. A sensitivity analysis further showed that SAR is most sensitive to Na, followed by Ca and EC. This research highlights that the M5 model tree can be successfully employed for the estimation of SAR. It also indicates that the practical, simple linear equations and the optimization performed with the M5 model tree reduce time and cost.
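
For readers unfamiliar with the quantities named above, the following is a minimal sketch of how SAR and the four reported error statistics are commonly computed. It is not the authors' code. It assumes the conventional definition SAR = Na / sqrt((Ca + Mg) / 2) with concentrations in meq/L, R² taken as 1 minus the ratio of residual to total sum of squares (the study may instead have used the squared correlation coefficient), and MRE expressed as a mean absolute percentage; the abstract does not state its exact definitions. The function names `sar` and `error_metrics` are hypothetical.

```python
import math


def sar(na_meq, ca_meq, mg_meq):
    """Sodium adsorption ratio from Na, Ca and Mg concentrations in meq/L.

    Assumes the conventional formula SAR = Na / sqrt((Ca + Mg) / 2).
    """
    divalent_mean = (ca_meq + mg_meq) / 2.0
    if divalent_mean <= 0:
        raise ValueError("Ca + Mg must be positive")
    return na_meq / math.sqrt(divalent_mean)


def error_metrics(observed, predicted):
    """Return R2, RMSE, MAE and MRE for paired observed/predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(mean squared error)
      MAE  = mean absolute error
      MRE  = 100 * mean(|obs - pred| / |obs|)
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")

    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs_err) / n,
        # Relative error is undefined where an observed value is zero.
        "MRE": 100.0 * sum(e / abs(o) for e, o in zip(abs_err, observed)) / n,
    }


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(sar(4.0, 3.0, 5.0))
    print(error_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Under these assumptions a sample with Na = 4, Ca = 3, and Mg = 5 meq/L has a divalent mean of 4 meq/L, so its SAR is 4 / sqrt(4) = 2.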

Keywords: Gaussian process (GP); M5 model tree; Sodium adsorption ratio; Support vector machine (SVM); Urmia Lake basin; Water quality.

MeSH terms

  • Adsorption
  • Agricultural Irrigation / standards*
  • Data Mining
  • Desert Climate
  • Environmental Monitoring / methods*
  • Iran
  • Lakes / chemistry*
  • Neural Networks, Computer
  • Rivers / chemistry*
  • Sodium / analysis*
  • Support Vector Machine
  • Water Quality / standards

Substances

  • Sodium