Development of a Low-Cost Distributed Computing Pipeline for High-Throughput Cotton Phenotyping

Sensors (Basel). 2024 Feb 2;24(3):970. doi: 10.3390/s24030970.

Abstract

In this paper, we present a low-cost distributed computing pipeline for cotton plant phenotyping built on Raspberry Pi, Hadoop, and deep learning. Specifically, we set up a cluster of Raspberry Pis in a primary-replica architecture on the Apache Hadoop ecosystem and use a Tiny-YOLOv4 model, pre-trained in our past work, for cotton bloom detection. We feed cotton image data collected from a research field in Tifton, GA, into the cluster's distributed file system for robust file access and parallel processing. We then submit job requests from our client to the cluster, which processes the image data in a distributed, parallel fashion, from pre-processing to bloom detection and spatio-temporal map creation. Additionally, we compare the performance of our four-node cluster with that of centralized, one-, two-, and three-node clusters. This work is the first to develop a distributed computing pipeline for high-throughput cotton phenotyping in field-based agriculture.
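
The map-and-aggregate shape of such a job can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (date, plot ID, image path), the function names, and the stand-in detector are all assumptions made for illustration, and the code runs locally rather than on Hadoop. The map step keys a per-image bloom count by (date, plot), and the reduce step sums the counts into a simple spatio-temporal table.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical record layout, one image per line: "<date>,<plot_id>,<image_path>"
Key = Tuple[str, str]


def map_image(line: str, detect: Callable[[str], int]) -> Tuple[Key, int]:
    """Map step: count blooms in one image and key the count by (date, plot)."""
    date, plot_id, image_path = line.strip().split(",")
    return (date, plot_id), detect(image_path)


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce step: sum bloom counts that share the same (date, plot) key."""
    totals: Dict[Key, int] = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_job(lines: Iterable[str], detect: Callable[[str], int]) -> Dict[Key, int]:
    """Run map then reduce locally; a cluster would spread the map calls over nodes."""
    return reduce_counts(map_image(line, detect) for line in lines if line.strip())


if __name__ == "__main__":
    # Stand-in for a real detector (assumption): fixed counts per image path.
    fake_counts = {"img_a.jpg": 2, "img_b.jpg": 1, "img_c.jpg": 4}
    records = [
        "2021-07-20,plot_01,img_a.jpg",
        "2021-07-20,plot_01,img_b.jpg",
        "2021-07-27,plot_01,img_c.jpg",
    ]
    print(run_job(records, fake_counts.__getitem__))
```

Because each image is mapped independently, the map calls can be distributed across nodes without coordination; only the final summation needs the results grouped by key.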

Keywords: big data; computer vision; cotton phenotyping; distributed computing.

MeSH terms

  • Electronic Data Processing
  • Gossypium*
  • Humans
  • Phenotype*