Investigation on the use of ensemble learning and big data in crop identification

Heliyon. 2023 Jan 31;9(2):e13339. doi: 10.1016/j.heliyon.2023.e13339. eCollection 2023 Feb.

Abstract

The agriculture sector in Egypt faces several problems, such as climate change, water storage, and yield variability. The comprehensive capabilities of Big Data (BD) can help in tackling the uncertainty of food supply that arises from factors such as soil erosion, water pollution, climate change, socio-cultural growth, governmental regulations, and market fluctuations. Crop identification and monitoring play a vital role in modern agriculture. Although several machine learning models have been used to identify crops, the performance of ensemble learning has not been investigated extensively. The massive volume of satellite imagery constitutes a big data problem, which requires deploying the proposed solution with big data technologies to manage, store, analyze, and visualize satellite data. In this paper, we develop a weighted voting mechanism for improving crop classification performance at a large scale, based on ensemble learning and a big data schema. Built upon Apache Spark, the popular DB Framework, the proposed approach was tested on El Salheya, Ismailia Governorate. The proposed ensemble approach improved performance by 6.5%, 1.9%, 4.4%, 4.9%, and 4.7% in precision, recall, F-score, Overall Accuracy (OA), and Matthews correlation coefficient (MCC), respectively. Our findings confirm that the proposed crop identification approach generalizes to a large-scale setting.

Keywords: Apache spark; Big data; Crop identification; DB Framework; Ensemble learning.
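
The weighted voting idea named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are derived or which base classifiers are combined, so everything below is an assumption made for illustration: the function name `weighted_vote`, the three hypothetical classifiers, their weights, and the crop labels are not taken from the paper. The sketch uses plain Python rather than Apache Spark to keep it self-contained.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-classifier label predictions by weighted voting.

    predictions: list of label lists, one inner list per classifier,
                 all of the same length (one label per sample).
    weights:     one non-negative weight per classifier (hypothetical here;
                 in practice they might come from validation scores).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    for i in range(n_samples):
        # Sum the weights of the classifiers voting for each label.
        scores = defaultdict(float)
        for preds, w in zip(predictions, weights):
            scores[preds[i]] += w
        # Pick the label with the highest total weight; iterating over
        # sorted labels makes tie-breaking deterministic.
        combined.append(max(sorted(scores), key=lambda label: scores[label]))
    return combined


# Hypothetical predictions from three classifiers on four samples.
clf_a = ["wheat", "wheat", "maize", "clover"]
clf_b = ["maize", "maize", "clover", "clover"]
clf_c = ["wheat", "maize", "clover", "clover"]

# Hypothetical weights: classifier A is trusted more than B and C together.
labels = weighted_vote([clf_a, clf_b, clf_c], [0.6, 0.25, 0.25])
print(labels)
```

In the second and third samples the two lower-weight classifiers agree with each other but are outvoted by the higher-weight one, which is the behavior that distinguishes weighted voting from a simple majority vote.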