Application of data mining algorithms for improving stress prediction of automobile drivers: A case study in Jordan

Comput Biol Med. 2019 Nov:114:103474. doi: 10.1016/j.compbiomed.2019.103474. Epub 2019 Sep 28.

Abstract

Driving daily through traffic congestion is recognised as a major cause of stress. High stress levels while driving impair the driver's decisions, which can lead to accidents and to long-term health problems. There is therefore a need to determine drivers' stress levels by measuring and predicting the major factors (features) that increase stress. In this paper, the problem of predicting automobile drivers' stress levels, as experienced during actual driving, is investigated by applying five data mining algorithms: K-Nearest Neighbour (KNN), Decision Tree (J48), Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Networks (ANN). An experiment was conducted with 14 drivers taking various routes in Amman, Jordan, with a wearable biomedical device attached to each driver to collect physiological data in real time. The collected dataset is labelled with two classes: 'Yes' to signify the presence of stress and 'No' to signify its absence. To apply the data mining algorithms effectively, oversampling was used so that the under-represented (minority) class did not degrade stress prediction. The results are evaluated in terms of stress prediction and compared, using the Friedman rank test, against standard reference approaches that do not use oversampling and/or feature selection. The proposed approach combined with RF outperformed all others in accuracy, AUC, specificity, and sensitivity, achieving 98.92%, 99.91%, 98.46%, and 99.36%, respectively.
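The oversampling step and the reported metrics can be illustrated with a minimal sketch. The abstract does not say which oversampling technique was used, so plain random oversampling (duplicating minority-class rows) is assumed here. The function names `random_oversample` and `binary_metrics` and the toy readings are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    is as frequent as the largest one (assumed technique, see lead-in)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive="Yes"):
    """Accuracy, sensitivity (true-positive rate) and specificity
    (true-negative rate) for a two-class 'Yes'/'No' problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical toy data: one made-up physiological reading per sample.
rows = [[0.9], [0.8], [0.2], [0.3], [0.1], [0.25]]
labels = ["Yes", "Yes", "No", "No", "No", "No"]
balanced_rows, balanced_labels = random_oversample(rows, labels)

# Hypothetical predictions, used only to exercise the metric definitions.
metrics = binary_metrics(["Yes", "Yes", "No", "No"], ["Yes", "No", "No", "No"])
```

In practice, oversampling is usually applied only to the training split, so that duplicated samples do not leak into the data used to compute accuracy, sensitivity, and specificity.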

Keywords: Data mining algorithms; Feature selection; Oversampling; Stress prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Algorithms*
  • Automobile Driving / psychology*
  • Data Mining / methods*
  • Decision Trees
  • Electrodiagnosis
  • Female
  • Humans
  • Jordan
  • Male
  • Middle Aged
  • Monitoring, Physiologic
  • Neural Networks, Computer
  • Signal Processing, Computer-Assisted
  • Stress, Psychological / diagnosis*
  • Support Vector Machine
  • Vital Signs
  • Young Adult