Factors affecting injury severity in motorcycle crashes: Different age groups analysis using Catboost and SHAP techniques

Traffic Inj Prev. 2024;25(3):472-481. doi: 10.1080/15389588.2023.2297168. Epub 2024 Jan 23.

Abstract

Objective: Motorcycle crashes often result in severe injuries that affect people's lives physically, financially, and psychologically. These injuries can be notably harmful to drivers of all age groups. The main objective of this study is to investigate the risk factors that contribute to crash injury severity in different age groups.

Methods: This objective is achieved by developing accurate machine learning (ML) based prediction models. This research examines the relationship between potential risk factors and injury severity in motorcycle-associated crashes using ML and the Shapley Additive Explanations (SHAP) technique. The SHAP technique helps interpret the ML models for traffic injury severity prediction and indicates significant non-linear interactions between dependent and independent variables. The data for this study was collected from the Provincial Emergency Response Service RESCUE 1122 for the Rawalpindi region (Pakistan) over three years (from 2017 to 2020). The Synthetic Minority Oversampling Technique (SMOTE) is employed to balance injury severity classes in the pre-processing phase.
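
The class-balancing step named in the Methods can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' code. The function name `smote`, the toy feature rows, and the class sizes are all hypothetical, and a real pipeline would more likely rely on an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base row (the row itself excluded)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(row, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy rows: [posted speed limit (km/h), number of lanes]
fatal = [[80.0, 2.0], [90.0, 2.0], [100.0, 4.0], [70.0, 2.0]]
minor_count = 10  # assumed size of the majority class

new_rows = smote(fatal, n_new=minor_count - len(fatal), k=2)
balanced_fatal = fatal + new_rows
print(len(balanced_fatal))  # 10
```

Plain interpolation treats every feature as continuous, so a count such as the number of lanes can take fractional values in the synthetic rows. Categorical predictors generally need a variant designed for mixed data.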

Results: The results demonstrate that age, gender, posted speed limit, the number of lanes, and month of the year are positively associated with severe and fatal injuries. This research also assesses how the modeling framework varies between the ML and classical statistical methods. The predictive performance of the proposed ML models was assessed using several evaluation metrics, and Catboost was found to outperform the XGBoost, Random Forest (RF), and Multinomial Logit (MNL) models.
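
The abstract does not name the evaluation metrics used. As one illustration of how multi-class severity models are often compared, the sketch below computes a macro-averaged F1 score, which weights each severity class equally regardless of how many crashes fall in it. The labels and the two sets of predictions are invented toy values for a pair of hypothetical models, not results from the study.

```python
def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores, giving every class the same weight."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented toy labels and predictions for two hypothetical models
y_true = ["minor", "minor", "severe", "severe", "fatal", "fatal"]
pred_a = ["minor", "minor", "severe", "severe", "fatal", "severe"]
pred_b = ["minor", "minor", "minor", "severe", "minor", "minor"]

print(round(macro_f1(y_true, pred_a), 3))
print(round(macro_f1(y_true, pred_b), 3))
```

In this toy case the second model never predicts the fatal class, which pulls its macro F1 down sharply even though half of its predictions are correct.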

Conclusion: The findings of this study can give road users, road safety authorities, stakeholders, policymakers, and decision-makers essential guidance for reducing the severity of crash injuries in Pakistan and in other countries with similar prevailing conditions.

Keywords: Catboost; Machine learning (ML); injury severity; shapley additive explanations (SHAP); traffic safety.

MeSH terms

  • Accidents, Traffic
  • Emergency Medical Services*
  • Humans
  • Logistic Models
  • Motorcycles
  • Risk Factors
  • Wounds and Injuries* / epidemiology