CDYL for infrared and visible light image dense small object detection

Sci Rep. 2024 Feb 12;14(1):3510. doi: 10.1038/s41598-024-54146-1.

Abstract

To address the phenomenon of many small and hard-to-detect objects in infrared and visible light images, we propose an object detection algorithm CDYL (Convolution to Fully Connect-ed-Deformable Convolution You only Look once) based on the CFC-DC (Convolution to Fully Connected-Deformable Convolution) module. The core operator of CDYL is CFC-DC, making our model not only have a large effective receptive field in infrared and visible light images, but also have adaptive spatial aggregation conditioned by input and task information. As a result, the CDYL reduces the strict inductive bias of traditional CNNs and has long-range dependence for large kernel convolution as well as adaptive spatial aggregation, deeply mining of edge and correlation information in images to enhance sensitivity to small objects, thereby improving performance in dense small object detection tasks. In order to improve the ability of the CFC-DC module to perceive the detailed information of the image, we use the Mish activation function, which has a wider minima which improves the generalization. The effectiveness as well as the generalization of CDYL is evaluated on an infrared image dataset and an UAV image dataset, and it is compared with other state-of-the-art object detection algorithms. Compared to the baseline network YOLOv8l, our model achieved a 3.0% improvement in mAP0.5 in infrared image detection tasks and a 1.1% improvement in mAP0.5 in visible light image detection tasks. The experimental results show that the proposed algorithm achieves superior average precision values on both infrared and visible light images, while maintaining a light weight. Code is publicly available at https://github.com/yangzhu1/CDYL .

Keywords: Computer vision and FliR_Adas_v2 dataset; Infrared and visible light; Object detection; YOLOv8.