Land subsidence modelling using tree-based machine learning algorithms

Sci Total Environ. 2019 Jul 1:672:239-252. doi: 10.1016/j.scitotenv.2019.03.496. Epub 2019 Apr 2.

Abstract

Land subsidence (LS) is among the most critical environmental problems, affecting both agricultural sustainability and urban infrastructure. Existing methods often use either simple regression models or complex hydraulic models to explain and predict LS. Few studies identify the risk factors and predict the risk of LS using machine learning models. This study compares four tree-based machine learning models for land subsidence hazard modelling in a study area in the Hamadan plain (Iran). The study also analyzes the importance of six risk factors for LS, namely topography (elevation, slope), geomorphology (distance from stream, drainage density), hydrology (groundwater drawdown) and lithology. Thematic layers of each variable related to the LS phenomenon are prepared and used as inputs to the four tree-based machine learning models, namely the Rule-Based Decision Tree (RBDT), Boosted Regression Trees (BRT), Classification And Regression Tree (CART), and Random Forest (RF) algorithms, to produce a consolidated LS hazard map. The accuracy of the generated maps is then evaluated using the area under the receiver operating characteristic curve (AUC) and the True Skill Statistic (TSS). The RF approach performed best in mapping the LS hazard (AUC 96.7% for training, AUC 93.8% for validation, TSS 0.912 for training, TSS 0.904 for validation), followed by BRT. Groundwater drawdown was the most influential factor contributing to land subsidence in the present study area, followed by lithology and distance from the stream network.
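
The two evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' code: the labels and scores are invented toy values, the 0.5 classification threshold is an assumption, and the function names are hypothetical. AUC is computed here as the fraction of (subsidence, non-subsidence) pairs ranked correctly, and TSS as sensitivity + specificity - 1.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def true_skill_statistic(y_true, y_pred):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Toy data (assumed, not from the study): 1 = subsidence, 0 = no subsidence.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Assumed threshold of 0.5 to turn hazard scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(y_true, scores)
tss = true_skill_statistic(y_true, y_pred)
print(f"AUC = {auc:.4f}, TSS = {tss:.3f}")
```

AUC is independent of any threshold, whereas TSS depends on the cutoff used to classify cells as hazardous, so the two metrics can rank models differently.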

Keywords: Artificial intelligence; Environmental management; GIS; Hazard; Spatial analysis.