Flash-flood hazard assessment using ensembles and Bayesian-based machine learning models: Application of the simulated annealing feature selection method

Sci Total Environ. 2020 Apr 1:711:135161. doi: 10.1016/j.scitotenv.2019.135161. Epub 2019 Nov 21.

Abstract

Flash-floods are increasingly recognized as a frequent natural hazard worldwide. Iran has been among the regions most devastated by major floods. While temporal flash-flood forecasting models are mainly developed for warning systems, models for assessing hazardous areas can greatly contribute to adaptation and mitigation policy-making and to disaster risk reduction. Previous research on flash-flood hazard mapping has highlighted the need for more accurate models. Thus, the current research proposes state-of-the-art ensemble models, namely the boosted generalized linear model (GLMBoost) and random forest (RF), together with the Bayesian generalized linear model (BayesGLM), for higher-performance modeling. Furthermore, a pre-processing method, namely simulated annealing (SA), is used to eliminate redundant variables from the modeling process. Results of the modeling based on the hit and miss analysis indicate high performance for the proposed models (accuracy = 90-92%, Kappa = 79-84%, Success ratio = 94-96%, Threat score = 80-84%, and Heidke skill score = 79-84%). The variables distance from the stream, vegetation, drainage density, land use, and elevation contributed more than the others to modeling the flash-flood. The results of this study can significantly facilitate mapping of hazardous areas and further assist watershed managers in controlling and remediating flood-induced damage in data-scarce regions.
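The hit and miss scores listed above can all be derived from a 2x2 contingency table of predicted versus observed flood locations. The following is a minimal sketch using the standard forecast-verification definitions, not the authors' code: the class name `Contingency`, the function names, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2x2 hit and miss table (all names here are illustrative assumptions)."""
    hits: int               # flood predicted, flood observed
    false_alarms: int       # flood predicted, no flood observed
    misses: int             # no flood predicted, flood observed
    correct_negatives: int  # no flood predicted, no flood observed

    @property
    def total(self) -> int:
        return (self.hits + self.false_alarms
                + self.misses + self.correct_negatives)


def accuracy(t: Contingency) -> float:
    # Fraction of all locations classified correctly.
    return (t.hits + t.correct_negatives) / t.total


def success_ratio(t: Contingency) -> float:
    # Fraction of predicted floods that were actually observed.
    return t.hits / (t.hits + t.false_alarms)


def threat_score(t: Contingency) -> float:
    # Critical success index: hits relative to all predicted or observed floods.
    return t.hits / (t.hits + t.false_alarms + t.misses)


def heidke_skill_score(t: Contingency) -> float:
    # Skill relative to the agreement expected by chance.
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    return 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))


def cohen_kappa(t: Contingency) -> float:
    # Observed agreement corrected for chance agreement.
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    n = t.total
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    example = Contingency(hits=45, false_alarms=3, misses=5, correct_negatives=47)
    for score in (accuracy, cohen_kappa, success_ratio,
                  threat_score, heidke_skill_score):
        print(f"{score.__name__}: {score(example):.3f}")
```

For a 2x2 table, the Heidke skill score and Cohen's Kappa are algebraically identical, which is consistent with the matching ranges reported for the two scores.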

Keywords: Bayesian; Ensemble machine learning; Flash-flood; Hazard; Simulated annealing.