Integrated machine learning methods with resampling algorithms for flood susceptibility prediction

Sci Total Environ. 2020 Feb 25:705:135983. doi: 10.1016/j.scitotenv.2019.135983. Epub 2019 Dec 6.

Abstract

Flood susceptibility projections that rely on standalone models, calibrated with a single train-test data split, yield biased results. This study proposed novel integrative flood susceptibility prediction models in which two multi-time resampling approaches, random subsampling (RS) and bootstrapping (BT), are integrated with three machine learning models: generalized additive model (GAM), boosted regression tree (BRT) and multivariate adaptive regression splines (MARS). The RS and BT algorithms each provided 10 runs of data resampling for training and validation of the models. The mean of the 10 runs of predictions was then used to produce the flood susceptibility maps (FSM). The methodology was applied to Ardabil Province, on the coastal margins of the Caspian Sea, which has faced destructive floods. The area under the curve (AUC) of the receiver operating characteristic (ROC), the true skill statistic (TSS) and the correlation coefficient (COR) were used to evaluate the predictive accuracy of the proposed models. Results demonstrated that the resampling algorithms improved the performance of the standalone GAM, MARS and BRT models. Results also revealed that the standalone models performed better with the BT algorithm than with the RS algorithm. The BT-GAM model attained the best performance in terms of statistical measures (AUC = 0.98, TSS = 0.93, COR = 0.91), followed by the BT-MARS model (AUC = 0.97, TSS = 0.91, COR = 0.91) and the BT-BRT model (AUC = 0.95, TSS = 0.79, COR = 0.79). The proposed models outperformed the benchmark models: standalone GAM, MARS and BRT, multilayer perceptron (MLP) and support vector machine (SVM). Given the strong performance of the proposed models over a large-scale area, promising results can be expected from them in other regions.
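The resampling-and-averaging procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 70/30 split fraction for RS, the use of out-of-bag samples as the BT validation set, and the placeholder learner `fit_mean_rate` are all assumptions made for illustration; in practice a GAM, BRT or MARS fit would take the placeholder's place.

```python
import random
from statistics import mean


def random_subsample(n, rng, train_frac=0.7):
    """Random subsampling (RS): split indices WITHOUT replacement.
    The 70/30 fraction is an assumed default, not taken from the study."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(train_frac * n))
    return idx[:cut], idx[cut:]


def bootstrap(n, rng):
    """Bootstrapping (BT): draw n training indices WITH replacement.
    Using the out-of-bag indices for validation is an assumption."""
    train = [rng.randrange(n) for _ in range(n)]
    chosen = set(train)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return train, out_of_bag


def fit_mean_rate(x_train, y_train):
    """Hypothetical placeholder learner: always predicts the training
    flood rate. Stands in for a GAM, BRT or MARS model."""
    rate = mean(y_train)
    return lambda x: rate


def ensemble_mean(fit, xs, ys, grid, splitter, n_runs=10, seed=0):
    """Refit on n_runs resamples and average the predictions per grid cell."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        train, _valid = splitter(len(xs), rng)
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        runs.append([model(x) for x in grid])
    # zip(*runs) groups the n_runs predictions belonging to each grid cell
    return [mean(cell) for cell in zip(*runs)]


def true_skill_statistic(y_true, y_pred):
    """TSS = sensitivity + specificity - 1 for binary labels (0 or 1).
    Assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn) + tn / (tn + fp) - 1


if __name__ == "__main__":
    xs = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.7, 2.0]  # toy predictor values
    ys = [0, 0, 0, 1, 0, 1, 1, 1]                  # toy flood / non-flood labels
    grid = [0.0, 1.0, 2.0]                         # toy map cells to score
    print(ensemble_mean(fit_mean_rate, xs, ys, grid, bootstrap))
    print(ensemble_mean(fit_mean_rate, xs, ys, grid, random_subsample))
```

Passing `bootstrap` or `random_subsample` as the `splitter` argument switches between the BT and RS variants without changing the rest of the pipeline, which mirrors how the same three learners are paired with both resampling schemes in the study.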

Keywords: Bootstrapping; Flood susceptibility; Machine learning; Random subsampling; Resampling approach.