Examining the impacts of crash data aggregation on SPF estimation

Accid Anal Prev. 2021 Sep:160:106313. doi: 10.1016/j.aap.2021.106313. Epub 2021 Aug 5.

Abstract

The American Association of State Highway and Transportation Officials' Highway Safety Manual (HSM) includes a collection of safety performance functions (SPFs), statistical models that estimate the expected crash frequency of roadway segments, intersections, and interchanges. These models are applied in several steps of the safety management process, including screening the road network for opportunities to improve safety and evaluating the performance of safety countermeasure deployments. The SPFs in the HSM are generally estimated using negative binomial regression. In some instances they are estimated using annual crash frequency and site-specific (e.g., traffic volume) data, while in others they are estimated using crash frequency and site-specific data aggregated over multiple years. This paper explores the differences that result from estimating SPFs with aggregate versus disaggregate data, using the same methods as those used to estimate the SPFs in the HSM. A synthetic dataset was first used to conduct these comparisons; these data were generated in a manner consistent with the properties of the negative binomial distribution. Then, an observational dataset from Pennsylvania was used to compare the SPFs from both aggregate and disaggregate data. The results show that SPFs estimated using the panel (disaggregate) data and the aggregated data provide similar model coefficients, although some differences may arise. However, the overdispersion parameter obtained using each dataset can differ significantly. These differences result in systematic biases in calculations of expected crash frequency when Empirical Bayes adjustments are applied, which, as the paper demonstrates, could lead to different outcomes in a network screening exercise. Overall, these results reveal that aggregating crash data might result in biased SPF outputs and lead to inconsistent Empirical Bayes adjustments.
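To illustrate why the overdispersion parameter matters for Empirical Bayes adjustments, the following is a minimal sketch, not the authors' code. It assumes the weighting form commonly given for the HSM's Empirical Bayes method, w = 1 / (1 + k * predicted), where k is the overdispersion parameter and the variance of the negative binomial counts is mu + k * mu^2. The function names, the site means, and the two values of k are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import numpy as np


def simulate_annual_counts(mu, k, n_years, rng):
    """Draw annual crash counts per site from a negative binomial
    distribution with mean mu and variance mu + k * mu**2.
    (Assumed parameterization; the paper's generator may differ.)"""
    mu = np.asarray(mu, dtype=float)
    n = 1.0 / k                      # NB "size" parameter
    p = n / (n + mu)                 # success probability giving mean mu
    return rng.negative_binomial(n, p[:, None], size=(mu.size, n_years))


def eb_expected(predicted, observed, k):
    """Empirical Bayes estimate: a weighted mix of the SPF prediction
    and the observed count, with weight w = 1 / (1 + k * predicted)."""
    w = 1.0 / (1.0 + k * predicted)
    return w * predicted + (1.0 - w) * observed


# Hypothetical site means (crashes per year) and a 5-year panel.
rng = np.random.default_rng(0)
annual = simulate_annual_counts(mu=[0.5, 1.0, 2.0], k=0.5, n_years=5, rng=rng)
aggregated = annual.sum(axis=1)      # one multi-year total per site

# Same prediction and observation, two hypothetical overdispersion values:
# a different k shifts the weight and therefore the EB estimate.
eb_a = eb_expected(predicted=2.0, observed=6.0, k=0.5)
eb_b = eb_expected(predicted=2.0, observed=6.0, k=0.2)
print(annual.shape, aggregated.shape, eb_a, eb_b)
```

Because the weight depends directly on k, two SPFs with near-identical coefficients but different overdispersion estimates can still rank the same site differently after the adjustment, which is the mechanism the abstract points to for network screening.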

Keywords: Data aggregation; Empirical Bayes adjustments; Negative binomial regression; Safety performance functions; Synthetic data.

MeSH terms

  • Accidents, Traffic / prevention & control
  • Bayes Theorem
  • Data Aggregation*
  • Environment Design*
  • Humans
  • Models, Statistical
  • Safety
  • Safety Management