Speech-Based Activity Recognition for Trauma Resuscitation

IEEE Int Conf Healthc Inform. 2020 Nov-Dec;2020. doi: 10.1109/ichi48887.2020.9374372. Epub 2021 Mar 12.

Abstract

We present a speech-based approach to recognizing team activities during trauma resuscitation. We first analyzed audio recordings of trauma resuscitations in terms of activity frequency, noise level, and activity-related keyword frequency to characterize the dataset. We next evaluated different audio-preprocessing parameters (spectral feature types and audio channels) to find the optimal configuration. We then introduced a novel neural network for recognizing trauma activities, in which a modified VGG network extracts features from the audio input. The output of the modified VGG network is combined with the output of a network that takes keyword text as input, and the combination is used to generate activity labels. We compared our system with several baselines and analyzed the performance results in detail for specific activities. Our results show that the proposed architecture, using Mel-spectrum spectral coefficient features from stereo-channel audio together with activity-specific frequent keywords, achieves the highest accuracy and average F1-score.
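
The fusion step described above, in which the audio branch output is combined with the keyword branch output to produce an activity label, can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword vocabulary, the activity names, the placeholder audio feature vector, and the hand-set weights are all hypothetical and chosen only for illustration. In the described system the audio features would come from the modified VGG network and the weights would be learned from data.

```python
import math

# Hypothetical vocabulary and label set, for illustration only.
KEYWORDS = ["pulse", "airway", "blood", "pressure"]
ACTIVITIES = ["pulse check", "airway assessment", "blood pressure"]


def encode_keywords(transcript, vocabulary=KEYWORDS):
    """Multi-hot vector: 1.0 where a vocabulary keyword occurs in the transcript."""
    tokens = set(transcript.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocabulary]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(audio_features, keyword_features, weights, biases):
    """Concatenate both branch outputs, then apply one linear layer and softmax."""
    fused = list(audio_features) + list(keyword_features)
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Placeholder standing in for the audio branch output (assumed, not real data).
audio_features = [0.2, -0.1]
keyword_features = encode_keywords("checking blood pressure now")

# Hand-set weights (one row per activity); a trained model would learn these.
weights = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # pulse check
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # airway assessment
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],  # blood pressure
]
biases = [0.0, 0.0, 0.0]

probabilities = fuse_and_classify(audio_features, keyword_features, weights, biases)
predicted = ACTIVITIES[probabilities.index(max(probabilities))]
```

Concatenation followed by a single linear layer is only one simple way to combine the two branches; the abstract does not specify how the combination is performed, so this choice is an assumption.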

Keywords: activity recognition; audio classification; keyword; speech processing; trauma resuscitation.