Crash injury severity analysis using a two-layer Stacking framework

Jinjun Tang; Jian Liang; Chunyang Han; Zhibin Li; Helai Huang

doi:10.1016/j.aap.2018.10.016

Accid Anal Prev. 2019 Jan:122:226-238. doi: 10.1016/j.aap.2018.10.016. Epub 2018 Nov 1.

Authors

Jinjun Tang¹, Jian Liang¹, Chunyang Han¹, Zhibin Li², Helai Huang¹

Affiliations

¹ School of Traffic and Transportation Engineering, Smart Transport Key Laboratory of Hunan Province, Central South University, Changsha, 410075, China.
² School of Transportation, Southeast University, Nanjing, 210096, China. Electronic address: lizhibin@seu.edu.cn.

PMID: 30390518
DOI: 10.1016/j.aap.2018.10.016

Abstract

Crash injury severity analysis helps traffic management agencies better understand the severity of crashes. This study proposes a two-layer Stacking framework to predict crash injury severity: the first layer integrates the advantages of three base classification methods, RF (Random Forests), AdaBoost (Adaptive Boosting), and GBDT (Gradient Boosting Decision Tree); the second layer completes the classification of crash injury severity with a Logistic Regression model. The data comprise 5538 crashes recorded at 326 freeway diverge areas. In model calibration, several parameters, including the number of trees in the three base classification methods, the learning rate, and the regularization coefficient, are optimized via a systematic grid search. In model validation, the performance of the Stacking model is compared with several traditional models, including the Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and Random Forests (RF), in multi-class classification experiments. The prediction results show that the Stacking model achieves superior performance as evaluated by two indicators: accuracy and recall. Furthermore, all the factors used in severity prediction are classified into different categories according to their influence on the results, and a sensitivity analysis of several significant factors is implemented to explore how variation in their values affects prediction accuracy.
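
The two-layer structure described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation: the synthetic three-class data, the small parameter grid, the fold counts, and the choice of macro-averaged recall are all assumptions made for illustration, and the abstract does not state which software or settings the study used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic three-class data standing in for crash records (assumption:
# this is NOT the study's data set of 5538 crashes).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# First layer: the three base classifiers named in the abstract.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=20, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=20, random_state=0)),
    ("gbdt", GradientBoostingClassifier(n_estimators=20, random_state=0)),
]

# Second layer: Logistic Regression fitted on the base learners'
# out-of-fold class probabilities.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=3,
)

# Illustrative grid over the kinds of parameters the abstract mentions:
# number of trees, learning rate, regularization coefficient.
# The specific values are assumptions, kept small so the sketch runs quickly.
param_grid = {
    "rf__n_estimators": [10, 20],
    "gbdt__learning_rate": [0.05, 0.1],
    "final_estimator__C": [0.1, 1.0],
}
search = GridSearchCV(stack, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate with the two indicators named in the abstract.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
macro_recall = recall_score(y_test, y_pred, average="macro")
print("best parameters:", search.best_params_)
print("accuracy:", round(accuracy, 3), "macro recall:", round(macro_recall, 3))
```

Training the second layer on out-of-fold predictions, as `StackingClassifier` does through its `cv` argument, keeps the Logistic Regression from seeing base-learner outputs for samples those learners were fitted on.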

Keywords: Adaptive Boosting; Crash injury severity; Gradient Boosting Decision Tree; Random Forests; Severity classification; Stacking model.

MeSH terms

Accidents, Traffic / classification*
Accidents, Traffic / statistics & numerical data
Built Environment / statistics & numerical data*
Decision Trees
Humans
Injury Severity Score*
Logistic Models
Neural Networks, Computer
Support Vector Machine
Wounds and Injuries / classification*
Wounds and Injuries / epidemiology