An Improved YOLOv5-Based Underwater Object-Detection Framework

Jian Zhang; Jinshuai Zhang; Kexin Zhou; Yonghui Zhang; Hongda Chen; Xinyue Yan

doi:10.3390/s23073693

Sensors (Basel). 2023 Apr 3;23(7):3693. doi: 10.3390/s23073693.

Authors

Jian Zhang¹,², Jinshuai Zhang², Kexin Zhou², Yonghui Zhang¹, Hongda Chen², Xinyue Yan²

Affiliations

¹ School of Information and Communication Engineering, Hainan University, Haikou 570228, China.
² School of Applied Science and Technology, Hainan University, Haikou 570228, China.

Abstract

General-purpose object-detection methods have made substantial progress. However, identifying underwater organisms raises additional challenges, including degraded image quality, complex backgrounds, and marine organisms that appear at widely different scales. To address these problems and further improve detection accuracy, this study proposes a marine biological object-detection architecture based on an improved YOLOv5 framework. First, the backbone of Real-Time Models for Object Detection (RTMDet) is introduced. Its core module, the Cross-Stage Partial Layer (CSPLayer), includes a large convolution kernel, which allows the detection network to capture contextual information more precisely and comprehensively. In addition, a common convolution layer is added to the stem layer to extract more valuable information from the images efficiently. Second, the BoT3 module, which uses the multi-head self-attention (MHSA) mechanism, is added to the neck of YOLOv5 so that the network performs better in scenes with dense targets and detection accuracy improves further. The introduction of the BoT3 module is a key innovation of this paper. Finally, union dataset augmentation (UDA) is performed on the training set using the Minimal Color Loss and Locally Adaptive Contrast Enhancement (MLLE) image augmentation method, and the result is used as the input to the improved YOLOv5 framework. Experiments on the underwater datasets URPC2019 and URPC2020 show that the proposed framework not only alleviates the interference of underwater image degradation but also reaches an mAP@0.5 of 79.8% and 79.4%, respectively, an improvement of 3.8% and 1.1% over the original YOLOv5, demonstrating superior performance for the high-precision detection of marine organisms.

Keywords: CSPNeXt block; YOLOv5; bottleneck transformer; object detection.
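
Note on the reported metric: mAP@0.5 treats a predicted box as a correct detection when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5. The following is a minimal Python sketch of that matching criterion only. It is not the authors' evaluation code, and the function names `iou` and `is_true_positive` and the (x1, y1, x2, y2) box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: the '@0.5' in mAP@0.5 is this IoU threshold."""
    return iou(pred_box, gt_box) >= threshold
```

The full metric additionally ranks detections by confidence, computes average precision per class from the precision-recall curve, and averages over classes; the sketch covers only the IoU test applied to each candidate match.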
