Mixed Receptive Fields Augmented YOLO with Multi-Path Spatial Pyramid Pooling for Steel Surface Defect Detection

Sensors (Basel). 2023 May 27;23(11):5114. doi: 10.3390/s23115114.

Abstract

Aiming at the problems of low detection efficiency and poor detection accuracy caused by texture feature interference and dramatic changes in the scale of defect on steel surfaces, an improved YOLOv5s model is proposed. In this study, we propose a novel re-parameterized large kernel C3 module, which enables the model to obtain a larger effective receptive field and improve the ability of feature extraction under complex texture interference. Moreover, we construct a feature fusion structure with a multi-path spatial pyramid pooling module to adapt to the scale variation of steel surface defects. Finally, we propose a training strategy that applies different kernel sizes for feature maps of different scales so that the receptive field of the model can adapt to the scale changes of the feature maps to the greatest extent. The experiment on the NEU-DET dataset shows that our model improved the detection accuracy of crazing and rolled in-scale, which contain a large number of weak texture features and are densely distributed by 14.4% and 11.1%, respectively. Additionally, the detection accuracy of inclusion and scratched defects with prominent scale changes and significant shape features was improved by 10.5% and 6.6%, respectively. Meanwhile, the mean average precision value reaches 76.8%, compared with the YOLOv5s and YOLOv8s, which increased by 8.6% and 3.7%, respectively.

Keywords: YOLOv5s; mixed receptive fields; multi-path spatial pyramid pooling; re-parameterized conv; steel surface defect detection.