TBC-YOLOv7: a refined YOLOv7-based algorithm for tea bud grading detection

Siyang Wang; Dasheng Wu; Xinyu Zheng

doi:10.3389/fpls.2023.1223410

Front Plant Sci. 2023 Aug 17;14:1223410. doi: 10.3389/fpls.2023.1223410. eCollection 2023.

Authors

Siyang Wang¹ ² ³, Dasheng Wu¹ ² ³, Xinyu Zheng¹ ² ³

Affiliations

¹ College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, China.
² Key Laboratory of State Forestry and Grassland Administration on Forestry Sensing Technology and Intelligent Equipment, Hangzhou, China.
³ Key Laboratory of Forestry Intelligent Monitoring and Information Technology of Zhejiang, Hangzhou, China.

Abstract

Introduction: Accurate grading identification of tea buds is a prerequisite for automated tea-picking based on machine vision systems. However, current target detection algorithms face challenges in detecting tea bud grades against complex backgrounds. In this paper, an improved YOLOv7 tea bud grading detection algorithm, TBC-YOLOv7, is proposed.

Methods: The TBC-YOLOv7 algorithm borrows the transformer architecture design from the natural language processing field, integrating a transformer module based on the contextual information in the feature map into the YOLOv7 algorithm, thereby facilitating self-attention learning and strengthening the connection of global feature information. To fuse feature information at different scales, the TBC-YOLOv7 algorithm employs a bidirectional feature pyramid network. In addition, coordinate attention is embedded at critical positions of the network to suppress useless background details while paying more attention to the prominent features of tea buds. The SIoU loss function is applied as the bounding box loss function to improve the convergence speed of the network.
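One way to picture the multi-scale fusion step is the fast normalized fusion described in the original BiFPN design, where each input feature map receives a learnable non-negative weight before the maps are summed. The sketch below is a minimal pure-Python illustration under that assumption. The abstract does not state which fusion variant TBC-YOLOv7 uses, and the function name, the flat-list representation of a feature map, and the epsilon value are all hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with normalized non-negative weights.

    features: list of equally long lists of floats (flattened feature maps,
              assumed to be already resized to a common resolution).
    weights:  one scalar per feature map (learned during training in practice).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature maps must have the same length")

    # ReLU keeps every weight non-negative so the normalization stays stable.
    relu_w = [max(0.0, w) for w in weights]
    total = sum(relu_w) + eps  # eps avoids division by zero

    # Weighted sum of the inputs, divided by the sum of the weights.
    return [
        sum(w * f[i] for w, f in zip(relu_w, features)) / total
        for i in range(length)
    ]


# Toy example: two "feature maps" with equal weights are roughly averaged.
fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the inputs; a weight that goes negative during training is clipped to zero, so that input simply drops out of the fusion.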

Results: The experiments indicate that TBC-YOLOv7 is effective across all grades of samples in the test set. Specifically, the model achieves precisions of 88.2% and 86.9%, with corresponding recalls of 81% and 75.9%. The mean average precision of the model reaches 87.5%, 3.4% higher than that of the original YOLOv7, with an average precision of up to 90% for the one-bud-with-one-leaf grade. Furthermore, the F1 score reaches 0.83. The model also outperforms the YOLOv7 model in terms of the number of parameters. Finally, the detection results of the model correlate strongly with the manual annotation results ($R^{2}$ = 0.89), with a root mean square error of 1.54.
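The reported metrics follow standard definitions, which can be sketched as below. This is a hedged illustration, not the authors' evaluation code: the helper names are hypothetical, the count lists at the end are made-up toy values, $R^{2}$ is assumed to be the coefficient of determination, and averaging the two per-grade F1 values is an assumption, since the abstract does not say how the overall F1 score was aggregated.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Precision/recall pairs quoted in the abstract.
f1_first = f1_score(0.882, 0.81)
f1_second = f1_score(0.869, 0.759)
# Assumption: the overall F1 is the plain average of the two values.
mean_f1 = (f1_first + f1_second) / 2

# Toy bud counts per image (NOT data from the paper).
manual = [10, 12, 8, 15]
detected = [10, 11, 8, 16]
toy_rmse = rmse(manual, detected)
toy_r2 = r_squared(manual, detected)
```

Under the averaging assumption, the two F1 values (about 0.84 and 0.81) average to roughly 0.83, which is consistent with the F1 score reported above.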

Discussion: The TBC-YOLOv7 model proposed in this paper exhibits superior performance in visual recognition, indicating that the improved YOLOv7 model fused with a transformer-style module can achieve higher grading accuracy on densely growing tea buds. It thereby enables grade detection of tea buds in practical scenarios, providing a solution and technical support for the automated collection of tea buds and the judging of their grades.

Keywords: BiFPN; CA; SIoU; YOLOv7; contextual transformer; tea bud grading detection.

Grants and funding

This work was financially supported by the Zhejiang Forestry Science and Technology Project (Grant No. 2023SY08), the National Natural Science Foundation of China (Grant No. 42001354), the Natural Science Foundation of Zhejiang Province (Grant No. LQ19D010011), and the research development fund project of Zhejiang A&F University (Grant No. 2018FR060).