Coffee and cashew nut dataset: A dataset for detection, classification, and yield estimation for machine learning applications

Data Brief. 2023 Dec 14:52:109952. doi: 10.1016/j.dib.2023.109952. eCollection 2024 Feb.

Abstract

Conventional methods of crop yield estimation are costly, inefficient, and prone to error, resulting in poor yield estimates. This limits farmers' ability to plan and manage their crop production pipelines and market processes. There is therefore a need for automated methods of crop yield estimation. However, developing accurate machine-learning methods for crop yield estimation depends on the availability of appropriate datasets, which are lacking, especially in sub-Saharan Africa. We present curated image datasets of coffee and cashew nuts acquired in Uganda during two crop harvest seasons. The datasets were collected over nine months, from September 2022 to May 2023, using a high-resolution camera mounted on an Unmanned Aerial Vehicle. They contain 3000 coffee and 3086 cashew nut images, for a total of 6086 images. Annotated objects of interest in the coffee dataset fall into five classes: unripe, ripening, ripe, spoilt, and coffee_tree. Those in the cashew nut dataset fall into six classes: tree, flower, premature, unripe, ripe, and spoilt. The datasets may be used for various machine-learning tasks, including flowering intensity estimation, fruit maturity stage analysis, disease diagnosis, crop variety identification, and yield estimation.
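
As an illustration of how the class lists and image counts above might be encoded for an object-detection pipeline, the following is a minimal Python sketch. The class names and image counts come from the abstract; the integer ids, their ordering, the helper names (`label_map`, `count_by_class`), and the sample label list are assumptions made for illustration and are not part of the published dataset format.

```python
from collections import Counter

# Class names as listed in the abstract; the ordering (and hence the ids) is assumed.
COFFEE_CLASSES = ["unripe", "ripening", "ripe", "spoilt", "coffee_tree"]
CASHEW_CLASSES = ["tree", "flower", "premature", "unripe", "ripe", "spoilt"]

# Image counts per dataset as reported in the abstract.
IMAGE_COUNTS = {"coffee": 3000, "cashew": 3086}


def label_map(class_names):
    """Map each class name to a zero-based integer id (hypothetical encoding)."""
    return {name: idx for idx, name in enumerate(class_names)}


def count_by_class(labels, class_names):
    """Count annotated objects per class for one image; unseen classes get 0."""
    counts = Counter(labels)
    return {name: counts.get(name, 0) for name in class_names}


coffee_ids = label_map(COFFEE_CLASSES)
cashew_ids = label_map(CASHEW_CLASSES)
total_images = sum(IMAGE_COUNTS.values())

# Hypothetical annotation labels for a single coffee image.
sample_labels = ["ripe", "ripe", "unripe", "coffee_tree"]
sample_counts = count_by_class(sample_labels, COFFEE_CLASSES)
```

Per-class object counts such as `sample_counts` are one simple input that a yield-estimation or maturity-stage model could build on; the actual annotation format should be checked against the dataset files.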

Keywords: Cashew apple; Coffee cherry; Image classification; Object detection; Yield estimation.