On the Use of a Convolutional Block Attention Module in Deep Learning-Based Human Activity Recognition with Motion Sensors

Diagnostics (Basel). 2023 May 26;13(11):1861. doi: 10.3390/diagnostics13111861.

Abstract

Sensor-based human activity recognition with wearable devices has captured the attention of researchers in the last decade. The possibility of collecting large sets of data from various sensors in different body parts, automatic feature extraction, and aiming to recognize more complex activities have led to a rapid increase in the use of deep learning models in the field. More recently, using attention-based models for dynamically fine-tuning the model features and, in turn, improving the model performance has been investigated. However, the impact of using channel, spatial, or combined attention methods of the convolutional block attention module (CBAM) on the high-performing DeepConvLSTM model, a hybrid model proposed for sensor-based human activity recognition, has yet to be studied. Additionally, since wearables have limited resources, analysing the parameter requirements of attention modules can serve as an indicator for optimizing resource consumption. In this study, we explored the performance of CBAM on the DeepConvLSTM architecture both in terms of recognition performance and the number of additional parameters required by attention modules. In this direction, the effect of channel and spatial attention, individually and in combination, were examined. To evaluate the model performance, the Pamap2 dataset containing 12 daily activities and the Opportunity dataset with its 18 micro activities were utilized. The results showed that the performance for Opportunity increased from 0.74 to 0.77 in the macro f1-score owing to spatial attention, while for Pamap2, the performance increased from 0.95 to 0.96 owing to the channel attention applied to DeepConvLSTM with a negligible number of additional parameters. Moreover, when the activity-based results were analysed, it was observed that the attention mechanism increased the performance of the activities with the worst performance in the baseline model without attention. We present a comparison with related studies that use the same datasets and show that we could achieve higher scores on both datasets by combining CBAM and DeepConvLSTM.

Keywords: attention mechanism; channel attention; convolutional neural networks; human activity recognition; hybrid deep models; motion sensors; spatial attention.