A novel attention-based enhancement framework for face mask detection in complicated scenarios

Hongyi Zhang; Jun Tang; Peishu Wu; Han Li; Nianyin Zeng

doi:10.1016/j.image.2023.116985

Signal Process Image Commun. 2023 Aug:116:116985. doi: 10.1016/j.image.2023.116985. Epub 2023 Apr 24.

Authors

Hongyi Zhang¹, Jun Tang¹, Peishu Wu², Han Li², Nianyin Zeng²

Affiliations

¹ School of Opto-electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China.
² Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China.

Abstract

In the context of COVID-19 pandemic prevention and control, accurate face mask detection via computer vision techniques is of vital significance. In this paper, a novel attention improved Yolo (AI-Yolo) model is proposed to handle the challenges of complicated real-world scenarios: dense object distribution, small-size objects, and interference from similar occlusions. In particular, a selective kernel (SK) module implements a soft attention mechanism in the convolution domain through split, fusion and selection operations; a spatial pyramid pooling (SPP) module enhances the expression of local and global features, which enriches the receptive field information; and a feature fusion (FF) module promotes sufficient fusion of multi-scale features from each resolution branch, using only basic convolution operators to avoid excessive computational complexity. In addition, the complete intersection over union (CIoU) loss function is adopted in the training stage for accurate positioning. Experiments are carried out on two challenging public face mask detection datasets, and the results show that the proposed AI-Yolo outperforms seven other state-of-the-art object detection algorithms, achieving the best mean average precision and F1 score on both datasets. Furthermore, the effectiveness of the designed modules in AI-Yolo is validated through extensive ablation studies. In short, the proposed AI-Yolo is capable of accomplishing face mask detection tasks in extremely complex situations with precise localization and accurate classification.
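
The CIoU loss mentioned above can be illustrated with a minimal sketch. It follows the commonly cited CIoU definition (an IoU term, a normalized center-distance penalty, and an aspect-ratio consistency term); the abstract does not give the authors' implementation, so the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made here for illustration only.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Illustrative CIoU loss for two boxes in (x1, y1, x2, y2) format.

    Assumption: standard CIoU definition, not the authors' code.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) - (gx1 + gx2)) ** 2 / 4 + ((y1 + y2) - (gy1 + gy2)) ** 2 / 4

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(x2, gx2) - min(x1, gx1)
    enclose_h = max(y2, gy2) - min(y1, gy1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, this quantity still varies with the distance between centers when the predicted and ground-truth boxes do not overlap, which is one reason it is often preferred for localizing small or densely packed objects.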

Keywords: Attention mechanism; Computer vision; Face occlusion detection; Multi-scale feature fusion.