A Data Augmentation-Based Evaluation System for Regional Direct Economic Losses of Storm Surge Disasters

Int J Environ Res Public Health. 2021 Mar 12;18(6):2918. doi: 10.3390/ijerph18062918.

Abstract

Accurate prediction of the direct economic losses caused by storm surge disasters provides critical support for disaster prevention decision-making and management. Previous research on storm surge disaster loss assessment paid little attention to the overfitting caused by data scarcity and excessive model complexity. To address these problems, this paper proposes a new evaluation system for forecasting the regional direct economic loss of storm surge disasters, consisting of three parts. First, a comprehensive assessment index system was established by considering the formation mechanism of storm surge disasters and the corresponding risk management theory. Second, a novel data augmentation technique, k-nearest neighbor-Gaussian noise (KNN-GN), was introduced to overcome data scarcity. Third, the ensemble learning algorithm XGBoost was used as the regression model to optimize and produce the final forecasts. To verify that KNN-GN-based XGBoost is the best combination, we conducted cross-comparison experiments with several data augmentation techniques and several widely used ensemble learning models, with traditional prediction models serving as baselines for the optimized forecasting system. The experimental results show that the KNN-GN-based XGBoost model provides more precise predictions than the traditional models, with a 64.1% average improvement in mean absolute percentage error (MAPE). The proposed evaluation system could also be extended and applied to other geography-related fields.
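The abstract does not spell out how KNN-GN works, so the following is only a minimal sketch of one plausible reading: each sample is perturbed with Gaussian noise whose per-feature scale is taken from the spread of its k nearest neighbours. The function names `knn_gn_augment` and `mape`, and the parameters `k`, `noise_scale`, and `n_new`, are hypothetical and not taken from the paper. The XGBoost regression step is omitted here; the augmented arrays would simply be passed to whatever regressor is used. The `mape` helper uses the standard definition of mean absolute percentage error.

```python
import numpy as np


def knn_gn_augment(X, y, k=3, noise_scale=0.05, n_new=2, seed=0):
    """Illustrative KNN + Gaussian-noise augmentation (an assumption,
    not the paper's exact algorithm).

    For every sample, find its k nearest neighbours in feature space and
    draw n_new noisy copies, using the neighbours' per-column standard
    deviation (scaled by noise_scale) as the noise standard deviation.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    data = np.column_stack([X, y])  # perturb features and target together

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)

    synthetic = []
    for i in range(len(data)):
        neighbours = np.argsort(dists[i])[:k]
        local_std = data[neighbours].std(axis=0)
        for _ in range(n_new):
            noise = rng.normal(0.0, noise_scale * local_std)
            synthetic.append(data[i] + noise)

    # Original samples come first, synthetic samples after them.
    augmented = np.vstack([data, np.array(synthetic)])
    return augmented[:, :-1], augmented[:, -1]


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```

With only a handful of historical storm surge records, a call such as `knn_gn_augment(X, y, k=2, n_new=3)` would return the original samples followed by three noisy copies of each. Tying the noise scale to the local neighbourhood keeps synthetic points close to regions where real data exists, which is one way such a scheme could limit overfitting on a small dataset.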

Keywords: KNN-GN; XGBoost; data augmentation; economic losses; storm surge.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Disasters*
  • Forecasting