RAAWC-UNet: an apple leaf and disease segmentation method based on residual attention and atrous spatial pyramid pooling improved UNet with weight compression loss

Front Plant Sci. 2024 Mar 11:15:1305358. doi: 10.3389/fpls.2024.1305358. eCollection 2024.

Abstract

Introduction: Early detection of leaf diseases is necessary to control the spread of plant diseases, and one of the important steps is the segmentation of leaf and disease images. Uneven lighting and leaf overlap in complex scenes make segmentation of leaves and diseases quite difficult. Moreover, the large difference between the proportions of leaf and disease pixels makes diseases hard to identify.

Methods: To address these issues, RAAWC-UNet is proposed: a UNet improved with a residual attention mechanism, atrous spatial pyramid pooling, and a weight compression loss. First, the weight compression loss introduces a modulation factor in front of the cross-entropy loss to address the imbalance between foreground and background pixels. Second, the residual network and the convolutional block attention module are combined to form Res_CBAM, which accurately localizes pixels at the edge of the disease and alleviates both vanishing gradients and the loss of semantic information caused by downsampling. Finally, in the last downsampling layer, atrous spatial pyramid pooling replaces the two convolutions to compensate for insufficient spatial context information.
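
The idea of placing a modulation factor in front of the cross-entropy loss can be illustrated with a minimal sketch. The abstract does not give the exact form of the weight compression loss, so the factor used here, (1 - p)^gamma in the style of focal loss, is an assumption chosen only for illustration, as are the function name and the default gamma.

```python
import math

def modulated_cross_entropy(p_true, gamma=2.0):
    """Cross-entropy for one pixel, scaled by a modulation factor.

    p_true is the predicted probability of the pixel's true class.
    The factor (1 - p_true) ** gamma is an ASSUMED, focal-loss-style
    choice; it is not taken from the paper.
    """
    eps = 1e-12
    p = min(max(p_true, eps), 1.0)   # clamp to avoid log(0)
    factor = (1.0 - p) ** gamma      # small for well-classified pixels
    return -factor * math.log(p)

# A confidently classified pixel (typical of abundant background)
easy = modulated_cross_entropy(0.95)
# A poorly classified pixel (typical of scarce disease regions)
hard = modulated_cross_entropy(0.30)
# gamma = 0 disables the factor and recovers plain cross-entropy
plain_easy = modulated_cross_entropy(0.95, gamma=0.0)

print(easy, hard, plain_easy)
```

Under this assumed factor, well-classified pixels contribute far less to the total loss than they would with plain cross-entropy, so the many easy pixels no longer dominate the few hard ones.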

Results: The experimental results show that, compared with UNet, the proposed RAAWC-UNet increases the intersection over union in leaf and disease segmentation by 1.91% and 5.61%, respectively, and the pixel accuracy of disease by 4.65%.
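
For readers unfamiliar with the two reported metrics, the sketch below computes per-class intersection over union and pixel accuracy on flattened label maps. The label encoding (0 = background, 1 = leaf, 2 = disease), the toy data, and the definition of per-class pixel accuracy as TP / (TP + FN) are assumptions for illustration; the abstract does not state the exact definitions used.

```python
def iou_and_pixel_accuracy(pred, target, cls):
    """Return (IoU, pixel accuracy) for one class over flat label lists."""
    tp = fp = fn = 0
    for p, t in zip(pred, target):
        if p == cls and t == cls:
            tp += 1          # predicted cls, truly cls
        elif p == cls:
            fp += 1          # predicted cls, truly something else
        elif t == cls:
            fn += 1          # missed a true cls pixel
    union = tp + fp + fn
    iou = tp / union if union else float("nan")
    # Assumed definition: fraction of true cls pixels labeled correctly
    pa = tp / (tp + fn) if (tp + fn) else float("nan")
    return iou, pa

# Toy flattened masks; assumed labels: 0 background, 1 leaf, 2 disease
pred   = [0, 1, 1, 2, 2, 0, 1, 2]
target = [0, 1, 2, 2, 2, 0, 1, 1]

disease_iou, disease_pa = iou_and_pixel_accuracy(pred, target, cls=2)
print(disease_iou, disease_pa)
```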

Discussion: The effectiveness of the proposed method was further verified by the better results in comparison with deep learning methods with similar network architectures.

Keywords: ASPP; CBAM; Resnet; apple leaf and disease; weight compress.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was funded partly by the Doctoral Foundation of Henan Polytechnic University under Grant B2022-15.