Assessing the impacts of enriched information on crash prediction performance

Accid Anal Prev. 2019 Jan:122:162-171. doi: 10.1016/j.aap.2018.10.004. Epub 2018 Oct 28.

Abstract

While countries with strong road safety records base their strategies on reliable data, in developing countries the unavailability of essential information makes this task challenging. This drawback has left researchers and planners facing the dilemma of "doing nothing" or "doing ill", restricting models to the available data, which are often limited to socio-economic and demographic variables. With this in mind, this study aims to demonstrate the potential improvements in spatial crash prediction model performance gained by enriching the explanatory variables and modelling casualties as a function of a more comprehensive dataset, especially one with an appropriate exposure variable. The experimental work develops and compares models based on available information from São Paulo, Brazil, and Flanders, the Dutch-speaking region of Belgium. Prediction models are developed within the framework of Geographically Weighted Regression with Poisson-distributed errors. The response variables, casualties in the models for Flanders and fatalities in the models for São Paulo, are divided into two sets by transport mode: active transport (i.e., pedestrians and cyclists) and motorized transport (i.e., motorized vehicle occupants). To assess the impact of the enriched information on model performance, casualties are first associated with all available variables for São Paulo and the corresponding ones for Flanders. In the next step, prediction models are developed for Flanders only, considering all the available information in the Flemish dataset. The findings showed that adding the supplementary data reduced AICc and MSPE by 20% and 25%, respectively, for motorized transport, and by 25% and 35%, respectively, for active transport.
From a practical standpoint, the results could help identify hotspots and relate the most influential factors to them, suggesting sites and data that should be prioritized in future local investigations. Besides minimizing data collection costs, this could help policy makers identify, implement and enforce appropriate countermeasures.
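
The modelling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, the bandwidth selection procedure, or the software used. The sketch assumes a fixed Gaussian kernel, an exposure variable entering as a log offset, and hypothetical function names (`gaussian_kernel`, `fit_weighted_poisson`, `gwpr_fit`, `mspe`, `aicc`, `percent_reduction`). The `aicc` helper is the generic small-sample corrected AIC; in geographically weighted models `k` is usually taken as the effective number of parameters, which this sketch does not compute.

```python
import numpy as np


def gaussian_kernel(coords, i, bandwidth):
    """Geographic weights of all zones relative to zone i (assumed fixed Gaussian kernel)."""
    dist = np.linalg.norm(coords - coords[i], axis=1)
    return np.exp(-0.5 * (dist / bandwidth) ** 2)


def fit_weighted_poisson(X, y, offset, w, n_iter=50, tol=1e-8):
    """Poisson regression with log link and observation weights w, fitted by IRLS."""
    beta = np.zeros(X.shape[1])
    mu = y + 0.5                      # starting values that keep log(mu) finite
    eta = np.log(mu)
    for _ in range(n_iter):
        z = eta - offset + (y - mu) / mu      # working response
        XtW = X.T * (w * mu)                  # kernel weight times IRLS weight
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta_new + offset
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


def gwpr_fit(coords, X, y, offset, bandwidth):
    """Fit one local Poisson model per zone; return local coefficients and fitted counts."""
    n, p = X.shape
    betas = np.empty((n, p))
    for i in range(n):
        w = gaussian_kernel(coords, i, bandwidth)
        betas[i] = fit_weighted_poisson(X, y, offset, w)
    y_hat = np.exp(np.sum(X * betas, axis=1) + offset)
    return betas, y_hat


def mspe(y, y_hat):
    """Mean squared prediction error."""
    return float(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))


def aicc(log_lik, k, n):
    """Small-sample corrected AIC for k (effective) parameters and n observations."""
    return -2.0 * log_lik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


def percent_reduction(baseline, enriched):
    """Relative improvement of the enriched model over the baseline, in percent."""
    return 100.0 * (baseline - enriched) / baseline
```

Under these assumptions, a comparison like the one reported would fit one model on the restricted variable set and one on the enriched set, then pass the two AICc (or MSPE) values to `percent_reduction`. Computing MSPE on held-out zones rather than on the fitted data is a design choice left open here.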

Keywords: Crash prediction models; Enriched data; Geographically Weighted Regression; Road safety.

MeSH terms

  • Accidents, Traffic / prevention & control
  • Accidents, Traffic / statistics & numerical data*
  • Automobile Driving / statistics & numerical data
  • Belgium
  • Brazil
  • Humans
  • Models, Statistical*
  • Probability
  • Safety