Features Split and Aggregation Network for Camouflaged Object Detection

Zejin Zhang; Tao Wang; Jian Wang; Yao Sun

doi:10.3390/jimaging10010024

J Imaging. 2024 Jan 18;10(1):24. doi: 10.3390/jimaging10010024.

Authors

Zejin Zhang¹, Tao Wang¹, Jian Wang¹ ², Yao Sun¹ ²

Affiliations

¹ HDU-ITMO Joint Institute, Hangzhou Dianzi University, Hangzhou 310018, China.
² School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China.

PMID: 38249009
DOI: 10.3390/jimaging10010024

Abstract

Camouflaged objects blend into their surroundings, so the difference between foreground and background is easily overlooked, which places higher demands on detection systems. In this paper, we present a new framework for camouflaged object detection (COD) named FSANet, which consists mainly of three components: spatial detail mining (SDM), cross-scale feature combination (CFC), and a hierarchical feature aggregation decoder (HFAD). The framework simulates the three-stage detection process of the human visual mechanism when observing a camouflaged scene. Specifically, we extract five feature layers using the backbone and divide them into two parts, with the second layer as the boundary. The SDM module simulates a human's cursory inspection of camouflaged objects: it gathers spatial details (such as edges and textures) and fuses the features to create a cursory impression. The CFC module observes high-level features from various viewing angles and extracts the features they share by thoroughly filtering features of various levels. We also design side-join multiplication in the CFC module to avoid detail distortion and use element-wise feature multiplication to filter out noise. Finally, we construct an HFAD module to deeply mine effective features from these two stages, guide the fusion of low-level features with high-level semantic knowledge, and refine the camouflage map using a hierarchical cascade. Compared with nineteen deep-learning-based methods on seven widely used metrics, our proposed framework shows clear advantages on four public COD datasets, demonstrating the effectiveness and superiority of our model.

Keywords: bio-inspired network; camouflaged object detection; context-aware features; multi-scale features.
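
Two of the operations named in the abstract, splitting the five backbone layers at the second layer and filtering noise by element-wise multiplication, can be illustrated with a minimal sketch. This is not the FSANet implementation. The function names `split_features` and `elementwise_multiply` are hypothetical, nested Python lists stand in for feature tensors, and placing the second layer in the low-level group is an assumption, since the abstract only says the second layer is the boundary.

```python
from typing import List, Sequence, Tuple

# A tiny 2-D "feature map"; a stand-in for a real tensor (assumption).
Matrix = List[List[float]]


def split_features(
    levels: Sequence[Matrix], boundary: int = 2
) -> Tuple[List[Matrix], List[Matrix]]:
    """Split backbone levels into a low-level and a high-level group.

    Assumption: levels 1..boundary form the low-level group and the
    remaining levels form the high-level group.
    """
    if not 0 < boundary < len(levels):
        raise ValueError("boundary must fall strictly inside the level list")
    return list(levels[:boundary]), list(levels[boundary:])


def elementwise_multiply(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two equally shaped maps position by position.

    A response survives only where both inputs are non-zero, which is
    the sense in which multiplication can suppress noise.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


if __name__ == "__main__":
    # Five dummy 1x1 levels standing in for the five backbone layers.
    levels = [[[float(i)]] for i in range(1, 6)]
    low, high = split_features(levels)
    print(len(low), len(high))

    # The 0.1 response in `a` has no support in `b`, so it is zeroed out.
    a = [[0.9, 0.1], [0.0, 0.8]]
    b = [[1.0, 0.0], [0.5, 1.0]]
    print(elementwise_multiply(a, b))
```

In a real network the two inputs to the multiplication would be feature maps of matching resolution and channel count; the shape check above mirrors that requirement.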