Bird Object Detection: Dataset Construction, Model Performance Evaluation, and Model Lightweighting

Yang Wang; Jiaogen Zhou; Caiyun Zhang; Zhaopeng Luo; Xuexue Han; Yanzhu Ji; Jihong Guan

doi:10.3390/ani13182924

Animals (Basel). 2023 Sep 14;13(18):2924. doi: 10.3390/ani13182924.

Authors

Yang Wang¹ ², Jiaogen Zhou², Caiyun Zhang², Zhaopeng Luo³, Xuexue Han³, Yanzhu Ji⁴, Jihong Guan¹

Affiliations

¹ Department of Computer Science and Technology, Tongji University, Shanghai 201804, China.
² Jiangsu Province Engineering Research Center for Intelligent Monitoring and Management of Small Water Bodies, Huaiyin Normal University, Huaian 223300, China.
³ Huai'an City Zoo, Huaian 223300, China.
⁴ Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China.

Abstract

Object detection technology can help automate bird recognition and make field surveys of birds more convenient. However, progress is hindered by the absence of dedicated bird datasets and evaluation benchmarks. To address this, we constructed the largest known bird object detection dataset, compared the performance of eight mainstream detection models on bird object detection tasks, and proposed feasible approaches for model lightweighting in bird object detection. Our bird detection dataset, GBDD1433-2023, includes 1433 globally common bird species and 148,000 manually annotated bird images. On this dataset, two-stage detection models such as Faster R-CNN and Cascade R-CNN outperformed one-stage models, achieving a mean Average Precision (mAP) of 73.7%. In addition, two-stage models were more robust than one-stage models to variations in foreground image scaling and to background interference in bird images. On bird counting tasks, accuracy ranged from 60.8% to 77.2% for images containing up to five birds, but decreased sharply beyond that count, suggesting limitations of object detection models in multi-bird counting tasks. Finally, we proposed an adaptive localization distillation method for one-stage lightweight object detection models that are suitable for offline deployment, which improved the performance of these models. Overall, our work provides an enriched dataset and practical guidelines for selecting suitable bird detection models.

Keywords: adaptive localization distillation; bird counting; bird monitoring; model lightweighting; object detection.

Grants and funding