WE3DS: An RGB-D Image Dataset for Semantic Segmentation in Agriculture

Sensors (Basel). 2023 Mar 1;23(5):2713. doi: 10.3390/s23052713.

Abstract

Smart farming (SF) applications rely on robust and accurate computer vision systems. An important computer vision task in agriculture is semantic segmentation, which aims to classify each pixel of an image and can be used for selective weed removal. State-of-the-art implementations use convolutional neural networks (CNN) that are trained on large image datasets. In agriculture, publicly available RGB image datasets are scarce and often lack detailed ground-truth information. In contrast to agriculture, other research areas feature RGB-D datasets that combine color (RGB) with additional distance (D) information. Such results show that including distance as an additional modality can improve model performance further. Therefore, we introduce WE3DS as the first RGB-D image dataset for multi-class plant species semantic segmentation in crop farming. It contains 2568 RGB-D images (color image and distance map) and corresponding hand-annotated ground-truth masks. Images were taken under natural light conditions using an RGB-D sensor consisting of two RGB cameras in a stereo setup. Further, we provide a benchmark for RGB-D semantic segmentation on the WE3DS dataset and compare it with a solely RGB-based model. Our trained models achieve up to 70.7% mean Intersection over Union (mIoU) for discriminating between soil, seven crop species, and ten weed species. Finally, our work confirms the finding that additional distance information improves segmentation quality.

Keywords: RGB-D; crop farming; image dataset; semantic segmentation; stereo vision; weed detection.