Feature selection methods for characterizing and classifying adaptive Sustainable Flood Retention Basins

Water Res. 2011 Jan;45(3):993-1004. doi: 10.1016/j.watres.2010.10.006. Epub 2010 Oct 16.

Abstract

The European Union's Flood Directive 2007/60/EC requires member states to produce flood risk maps for all river basins and coastal areas at risk of flooding by 2013. As a result, flood risk assessment has become an urgent challenge requiring a range of rapid and effective tools and approaches. The Sustainable Flood Retention Basin (SFRB) concept has evolved to provide a rapid assessment technique for impoundments that have a pre-defined or potential role in flood defense and diffuse pollution control. A previous version of the SFRB survey method, developed by the co-author Scholz in 2006, recommends gathering over 40 variables to characterize an SFRB. Collecting all these variables is relatively time-consuming and, more importantly, the variables are often correlated with each other. Therefore, the objective is to explore the correlation among these variables and to find the most important variables to represent an SFRB. Three feature selection techniques (Information Gain, Mutual Information and Relief) were applied to the SFRB data set to rank the variables by their importance for classification accuracy. Four benchmark classifiers (Support Vector Machine, K-Nearest Neighbours, C4.5 Decision Tree and Naïve Bayes) were subsequently used to verify the effectiveness of classification with the selected variables and to identify the optimal number of variables automatically. Experimental results indicate that the proposed approach provides a simple, rapid and effective framework for variable selection and SFRB classification. Nine important variables are sufficient to classify SFRB accurately. Finally, six typical cases were studied to verify the performance of the nine identified variables on different SFRB types. The findings provide a rapid scientific tool for SFRB assessment in practice. Moreover, the generic nature of this tool also allows for its wide application in other areas.
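
The rank-then-verify workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses only Information Gain as the ranking criterion and a leave-one-out 1-nearest-neighbour classifier as a stand-in for the four benchmark classifiers, and it assumes that the "optimal number of variables" is the smallest top-ranked subset that reaches the best accuracy. The function names and the toy data set (three discrete variables, two classes) are hypothetical and are not taken from the SFRB survey.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on a discrete variable."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (indices sorted by decreasing information gain, list of gains)."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels)
             for j in range(n_features)]
    ranking = sorted(range(n_features), key=lambda j: -gains[j])
    return ranking, gains


def loo_accuracy_1nn(rows, labels, features):
    """Leave-one-out accuracy of 1-NN using Hamming distance on `features`."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for k, other in enumerate(rows):
            if k == i:
                continue  # hold out the sample being classified
            dist = sum(row[j] != other[j] for j in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[k]
        correct += best_label == labels[i]
    return correct / len(rows)


def select_feature_count(rows, labels):
    """Score the top-k ranked variables for every k; keep the smallest best k.

    Assumption: "optimal" means the fewest variables reaching peak accuracy.
    """
    ranking, _ = rank_features(rows, labels)
    scores = [loo_accuracy_1nn(rows, labels, ranking[:k])
              for k in range(1, len(ranking) + 1)]
    best_k = scores.index(max(scores)) + 1
    return ranking[:best_k], scores


# Hypothetical toy data: variable 0 determines the class, variable 1 is
# unrelated to the class, and variable 2 is constant.
rows = [(0, 0, 1), (0, 1, 1), (0, 0, 1), (0, 1, 1),
        (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 1, 1)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

ranking, gains = rank_features(rows, labels)
selected, scores = select_feature_count(rows, labels)
print("ranking:", ranking)
print("selected variables:", selected)
```

On real survey data, continuous variables would need to be discretized before computing information gain, and the accuracy estimate would normally come from cross-validation over several classifiers rather than from a single nearest-neighbour rule.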

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Floods*
  • Models, Theoretical
  • Risk Assessment