Data-centric annotation analysis for plant disease detection: Strategy, consistency, and performance

Front Plant Sci. 2022 Dec 7;13:1037655. doi: 10.3389/fpls.2022.1037655. eCollection 2022.

Abstract

Object detection models have become the tool of choice for plant disease detection in precision agriculture. Most existing research improves performance by refining network architectures and optimizing loss functions. However, because annotation quality strongly influences results and annotation is costly, the data-centric side of a project also deserves investigation, in particular the relationship between annotation strategy, annotation quality, and model performance. In this paper, we propose a systematic framework comprising four annotation strategies for plant disease detection: local, semi-global, global, and symptom-adaptive annotation. Labels produced with different annotation strategies yield markedly different model performance. We conduct an interpretability study of the annotation strategies using class activation maps. In addition, we define five types of inconsistency in the annotation process and investigate how severely inconsistent labels affect model performance. Finally, we discuss the problem of label inconsistency during data augmentation. Overall, this data-centric quantitative analysis clarifies the significance of annotation strategy and offers practitioners a way to obtain higher performance and reduce annotation costs in plant disease detection. Our work encourages researchers to pay more attention to annotation consistency and to the essential issues of annotation strategy. The code will be released at: https://github.com/JiuqingDong/PlantDiseaseDetection_Yolov5.
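
To make the notion of an inconsistent bounding box concrete, one common way to quantify disagreement between two annotations of the same lesion is intersection over union (IoU). The sketch below is illustrative only and is not taken from the paper or its repository: the function name `iou`, the `(x_min, y_min, x_max, y_max)` box format, and the example coordinates are all assumptions. It shows how a tight "local" box and a loose "global" box drawn around the same symptom can overlap very little by this measure.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: a small box around a single lesion versus
# a box around the whole leaf that contains it.
local_box = (40.0, 40.0, 60.0, 60.0)
global_box = (0.0, 0.0, 100.0, 100.0)
print(iou(local_box, global_box))
```

A low IoU between annotators, or between strategies mixed within one dataset, is one possible signal that the labels disagree about what a positive box is; the five inconsistency types defined in the paper may be measured differently.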

Keywords: annotation strategy; data-centric; inconsistent bounding box; noisy labels; plant disease detection.

Grants and funding

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2019R1A6A1A09031717); by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) and the Korea Smart Farm R&D Foundation (KosFarm) through the Smart Farm Innovation Technology Development Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA), the Ministry of Science and ICT (MSIT), and the Rural Development Administration (RDA) (421005-04); and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2021R1A2C1012174).