Real-time accident detection: Coping with imbalanced data

Accid Anal Prev. 2019 Aug:129:202-210. doi: 10.1016/j.aap.2019.05.014. Epub 2019 Jun 3.

Abstract

Detecting accidents is of great importance since they often impose significant delay and inconvenience on road users. This study compares the performance of two popular machine learning models, Support Vector Machine (SVM) and Probabilistic Neural Network (PNN), in detecting the occurrence of accidents on the Eisenhower expressway in Chicago. Because accidents should be detected as rapidly as possible, seven models are trained and tested for each machine learning technique, using traffic condition data from 1 to 7 min after the actual occurrence. The main data sources used in this study are weather condition, accident, and loop detector data. Furthermore, to overcome the problem of imbalanced data (i.e., underrepresentation of accidents in the dataset), the Synthetic Minority Oversampling TEchnique (SMOTE) is used. The results show that although SVM achieves higher overall accuracy, PNN outperforms SVM in terms of the Detection Rate (DR) (i.e., percentage of correct accident detections). In addition, while both models perform best at 5 min after the occurrence of accidents, models trained at 3 or 4 min after the occurrence of an accident detect accidents more rapidly while still performing reasonably well. Lastly, a sensitivity analysis of PNN for Time-To-Detection (TTD) reveals that the speed difference between upstream and downstream of the accident location is particularly significant for detecting the occurrence of accidents.
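To make the imbalance problem concrete, the following is a minimal Python sketch of the idea behind SMOTE (new minority samples are interpolated between an existing minority sample and one of its nearest minority neighbours), together with accuracy and a detection rate computed from labels. It is an illustration under stated assumptions, not the study's implementation: the function names, the two-feature toy samples, the parameter k, and the reading of DR as the share of actual accidents that are flagged are all assumptions made here.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    Needs at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def accuracy(y_true, y_pred):
    """Share of all observations that are labelled correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def detection_rate(y_true, y_pred):
    """Share of actual accidents (label 1) that are predicted as accidents
    (assumed definition of DR for this sketch)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual = sum(1 for t in y_true if t == 1)
    return hits / actual if actual else 0.0


# Hypothetical accident samples: (speed difference, occupancy), made-up values.
accidents = [(30.0, 0.40), (35.0, 0.45), (28.0, 0.50)]
extra = smote(accidents, n_new=5)

# A classifier that never predicts an accident on a 10-sample toy set
# with 2 accidents: accuracy looks fine, detection rate does not.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print(accuracy(y_true, y_pred))        # 0.8
print(detection_rate(y_true, y_pred))  # 0.0
```

The toy labels show why overall accuracy and detection rate can rank models differently on imbalanced data: a model can score well on accuracy by favouring the majority class while missing the accidents themselves.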

Keywords: Accident detection; Machine learning; Probabilistic neural network; Real-time data; Support vector machine.

MeSH terms

  • Accidents, Traffic / statistics & numerical data*
  • Chicago
  • Humans
  • Neural Networks, Computer*
  • Support Vector Machine*
  • Time Factors
  • Weather