BGCP-based traffic data imputation and accident detection applications for the national trunk highway

Accid Anal Prev. 2023 Jun:186:107051. doi: 10.1016/j.aap.2023.107051. Epub 2023 Apr 3.

Abstract

Given the large volume of intelligent transportation data now being collected, missing values are often inevitable. Previous work has shown the advantages of tensor decomposition-based approaches for multi-dimensional data imputation problems. However, a research gap remains in examining the imputation performance of these methods and their application to accident detection. Using a two-month spatiotemporal traffic speed dataset collected on the national trunk highway in Shandong, China, this paper employs Bayesian Gaussian CANDECOMP/PARAFAC (BGCP) decomposition to impute missing speed data under different missing rates and missing scenarios. The dataset is built considering both temporal and road functions. Applying the imputation results to accident detection is also one of the main aims of this work. To that end, multiple data sources, such as traffic operation status and weather, are combined, and eXtreme Gradient Boosting (XGBoost) is used to build accident detection models. The results show that the BGCP model can produce accurate imputations even under temporally correlated data corruption. They also suggest that, when there are continuous periods of missing speed data (missing rate greater than 10%), imputation as a pre-processing step is imperative to maintain the accuracy of accident detection. The objective of this work is thus to provide insights for traffic management practitioners and academics performing spatiotemporal data imputation tasks.
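As a rough illustration of the tensor-decomposition idea behind the imputation step, the sketch below fits a CANDECOMP/PARAFAC model to the observed entries of a three-way tensor and fills the missing entries from the reconstruction. It is not the paper's BGCP implementation: it swaps the Bayesian Gibbs-sampling inference for a plain ridge-regularized alternating least squares, and the tensor layout (road segment × day × time interval), the rank, the synthetic data, and the function names `update_mode` and `cp_impute` are all assumptions made for the example.

```python
import numpy as np

def update_mode(X, mask, factors, mode, lam):
    """Ridge least-squares update of one factor matrix from observed entries only."""
    Xm = np.moveaxis(X, mode, 0)        # bring the updated mode to the front
    Mm = np.moveaxis(mask, mode, 0)
    P, Q = [factors[m] for m in range(3) if m != mode]
    rank = P.shape[1]
    # Khatri-Rao rows P[p] * Q[q], ordered like the flattened remaining axes
    kr = np.einsum("pr,qr->pqr", P, Q).reshape(-1, rank)
    for i in range(Xm.shape[0]):
        obs = Mm[i].reshape(-1)
        if not obs.any():
            continue                    # nothing observed: keep the current row
        D = kr[obs]
        y = Xm[i].reshape(-1)[obs]
        factors[mode][i] = np.linalg.solve(D.T @ D + lam * np.eye(rank), D.T @ y)

def cp_impute(X, mask, rank=3, n_iter=50, lam=1e-2, seed=0):
    """Fill entries where mask is False with a rank-`rank` CP reconstruction."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            update_mode(X, mask, factors, mode, lam)
    recon = np.einsum("ir,jr,kr->ijk", *factors)
    return np.where(mask, X, recon)     # observed values are left untouched

# Synthetic low-rank "speed" tensor (hypothetical: 20 segments x 15 days x 24 intervals)
rng = np.random.default_rng(42)
true_factors = [rng.uniform(0.5, 1.5, (n, 3)) for n in (20, 15, 24)]
X = np.einsum("ir,jr,kr->ijk", *true_factors)
mask = rng.random(X.shape) > 0.3        # True = observed, roughly 30% missing at random
X_obs = np.where(mask, X, np.nan)

X_hat = cp_impute(X_obs, mask)
rmse_cp = np.sqrt(np.mean((X_hat[~mask] - X[~mask]) ** 2))
rmse_mean = np.sqrt(np.mean((X_obs[mask].mean() - X[~mask]) ** 2))
print(f"RMSE on missing entries: CP {rmse_cp:.4f}, global mean {rmse_mean:.4f}")
```

The demo removes entries at random; a temporally correlated scenario like the one the abstract mentions could be mimicked by masking whole runs along the time axis instead. The imputed tensor could then be flattened into features for a downstream classifier such as XGBoost.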

Keywords: Accident detection; BGCP; Missing data imputation; Spatiotemporal traffic data; XGBoost.

MeSH terms

  • Accidents, Traffic* / prevention & control
  • Bayes Theorem
  • China
  • Humans
  • Transportation*
  • Weather