Optimizing machine learning algorithms for spatial prediction of gully erosion susceptibility with four training scenarios

Environ Sci Pollut Res Int. 2023 Apr;30(16):46979-46996. doi: 10.1007/s11356-022-25090-2. Epub 2023 Feb 3.

Abstract

Gully erosion causes high soil erosion rates and is an environmental concern that poses a major risk to the sustainability of cultivated areas worldwide. Gullies modify the land, shape new landforms, and damage agricultural fields. Gully erosion mapping is essential to understand the mechanism, development, and evolution of gullies. In this work, a new modeling approach was employed for gully erosion susceptibility mapping (GESM) in the Golestan Dam basin of Iran. The measurements of 14 gully erosion (GE) factors at 1042 GE locations were compiled in a spatial database. Four training datasets comprising 100%, 75%, 50%, and 25% of the entire database were used, and each dataset was split in the common 70:30 ratio for modeling and validation. Four machine learning models were employed to check the usefulness of the four training scenarios: maximum entropy (MaxEnt), general linear model (GLM), support vector machine (SVM), and artificial neural network (ANN). The results of random forest (RF) analysis indicated that the most important GE effective factors were distance from the stream, elevation, distance from the road, and vertical distance of the channel network (VDCN). The area under the receiver operating characteristic curve (AUROC) was used to validate the results. Our study showed that sample size influenced the performance of all four machine learning algorithms. However, the ANN was less sensitive to the reduction of sample size. In addition, validation results revealed that the ANN (AUROC = 85.7-90.4%) had the best performance across all four sample datasets. The results of this research can serve as useful guidelines for choosing machine learning methods when a complete gully inventory is not available in a region.
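The sampling design described in the abstract (four scenarios, each split 70:30) and the AUROC metric can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how the subsets were drawn, so simple random sampling, the fixed seed, the rounding down of subset sizes, and the helper names `make_scenarios` and `auroc` are all assumptions made for illustration.

```python
import random


def make_scenarios(n_locations, fractions=(1.0, 0.75, 0.5, 0.25),
                   train_ratio=0.7, seed=0):
    """Return {fraction: (train_ids, validation_ids)} for each scenario.

    Assumption: each scenario is a simple random subsample of the
    inventory, which is then split train_ratio : (1 - train_ratio).
    """
    rng = random.Random(seed)
    all_ids = list(range(n_locations))
    scenarios = {}
    for frac in fractions:
        # Subsample the inventory without replacement (size rounded down).
        subset = rng.sample(all_ids, int(n_locations * frac))
        # rng.sample returns items in random order, so slicing is a random split.
        n_train = int(len(subset) * train_ratio)
        scenarios[frac] = (subset[:n_train], subset[n_train:])
    return scenarios


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 1042 is the number of gully erosion locations reported in the abstract.
scenarios = make_scenarios(1042)
for frac, (train, validation) in scenarios.items():
    print(f"{frac:.0%} scenario: {len(train)} training / "
          f"{len(validation)} validation locations")
```

In a full workflow, each model would be fitted on the training locations of a scenario and `auroc` would be applied to its predicted susceptibility scores at the validation locations, so that the drop in AUROC from the 100% to the 25% scenario can be compared across models.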

Keywords: GIS; Gully erosion; Iran; Machine learning.

MeSH terms

  • Conservation of Natural Resources / methods
  • Databases, Factual
  • Geographic Information Systems*
  • Machine Learning
  • Soil*

Substances

  • Soil