A hybrid clustering and classification approach for predicting crash injury severity on rural roads

Int J Inj Contr Saf Promot. 2018 Mar;25(1):85-101. doi: 10.1080/17457300.2017.1341933. Epub 2017 Jul 10.

Abstract

Traffic crashes threaten transportation systems and carry wide-ranging social consequences for governments. Crashes are increasing in developing countries, and Iran, as a developing country, is not immune to this risk. Several studies in the literature predict traffic crash severity using artificial neural networks (ANNs), support vector machines and decision trees. This paper investigates crash injury severity on rural roads using a hybrid clustering and classification approach, comparing the performance of classification algorithms before and after clustering is applied. It also proposes a novel rule-based genetic algorithm (GA) for predicting crash injury severity and evaluates it against classification algorithms such as ANN using standard performance criteria. An analysis of 13,673 crashes (5,600 property damage, 778 fatal, 4,690 slight injury and 2,605 severe injury) on rural roads in Tehran Province, Iran, during 2011-2013 showed that the proposed GA method outperforms the other classification algorithms on precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method is the most interpretable of the methods compared, is easy to understand and provides feedback to analysts.

Keywords: Traffic crash severity prediction; clustering; decision trees; genetic algorithm.
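
The metrics quoted in the abstract (precision, recall, accuracy) can be illustrated for a four-class severity problem with a minimal Python sketch. The abstract does not say how per-class scores were averaged, so macro-averaging is an assumption here; the function name, label strings and toy predictions are hypothetical and are not the study's data or method. Only the class counts are taken from the abstract.

```python
from collections import Counter

# Class counts as reported in the abstract (label strings are illustrative).
CRASH_COUNTS = {
    "property_damage": 5600,
    "slight_injury": 4690,
    "severe_injury": 2605,
    "fatal": 778,
}
TOTAL_CRASHES = sum(CRASH_COUNTS.values())

# Accuracy of always predicting the largest class: a reference point
# for reading the reported accuracy on an imbalanced data set.
MAJORITY_BASELINE = max(CRASH_COUNTS.values()) / TOTAL_CRASHES


def classification_metrics(y_true, y_pred, classes=tuple(CRASH_COUNTS)):
    """Return (accuracy, macro_precision, macro_recall) for paired labels.

    Macro-averaging is an assumption; the source does not state which
    averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    # Count each (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(n for (t, p), n in pairs.items() if t == p) / len(y_true)

    precisions, recalls = [], []
    for c in classes:
        true_pos = pairs[(c, c)]
        predicted = sum(n for (t, p), n in pairs.items() if p == c)
        actual = sum(n for (t, p), n in pairs.items() if t == c)
        # A class that is never predicted (or never occurs) scores 0.
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)

    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n


# Toy labels for illustration only; these are not results from the paper.
y_true = ["property_damage", "property_damage", "slight_injury", "slight_injury",
          "severe_injury", "severe_injury", "fatal", "fatal"]
y_pred = ["property_damage", "slight_injury", "slight_injury", "slight_injury",
          "severe_injury", "fatal", "fatal", "fatal"]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"majority-class baseline accuracy={MAJORITY_BASELINE:.3f}")
```

Because property-damage crashes make up the largest class, the majority-class baseline computed above gives a rough floor against which an accuracy figure on this data set can be read.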

Publication types

  • Comparative Study

MeSH terms

  • Accidents, Traffic / classification*
  • Accidents, Traffic / statistics & numerical data*
  • Adolescent
  • Adult
  • Aged
  • Algorithms*
  • Child
  • Child, Preschool
  • Cluster Analysis
  • Developing Countries / statistics & numerical data*
  • Environment Design
  • Female
  • Forecasting / methods
  • Humans
  • Infant
  • Infant, Newborn
  • Iran / epidemiology
  • Male
  • Middle Aged
  • Models, Statistical*
  • Neural Networks, Computer
  • Rural Population
  • Support Vector Machine
  • Trauma Severity Indices
  • Wounds and Injuries / epidemiology*
  • Young Adult