Predicting Grape Sugar Content under Quality Attributes Using Normalized Difference Vegetation Index Data and Automated Machine Learning

Sensors (Basel). 2022 Apr 23;22(9):3249. doi: 10.3390/s22093249.

Abstract

Wine grapes need frequent monitoring to achieve high yields and quality. Non-destructive methods, such as proximal and remote sensing, are commonly used to estimate crop yield and quality characteristics, and spectral vegetation indices (VIs) are often used to present site-specific information. Analysis of laboratory samples is the most popular method for determining the quality characteristics of grapes, although it is time-consuming and expensive. In recent years, several machine learning-based methods have been developed to predict crop quality. Although these techniques require the extensive involvement of experts, automated machine learning (AutoML) offers the possibility to improve this task, saving time and resources. In this paper, we propose an innovative approach for robust prediction of grape quality attributes by combining open-source AutoML techniques and Normalized Difference Vegetation Index (NDVI) data for vineyards obtained from four different platforms (two proximal vehicle-mounted canopy reflectance sensors, orthomosaics from UAV images, and Sentinel-2 remote sensing imagery) during the 2019 and 2020 growing seasons. We investigated AutoML, extending our earlier work on manually fine-tuned machine learning methods. Results of the two approaches using Ordinary Least Squares (OLS), Theil-Sen and Huber regression models and tree-based methods were compared. Support Vector Machines (SVMs) and Automatic Relevance Determination (ARD) were included in the analysis, and different combinations of sensors and data collected over two growing seasons were investigated. Results showed promising performance of Unmanned Aerial Vehicle (UAV) and Spectrosense+ GPS data in predicting grape sugars, especially in mid to late season with full canopy growth. Regression models with both manually fine-tuned ML (R² = 0.61) and AutoML (R² = 0.65) provided similar results, with the latter slightly improved for both 2019 and 2020.
When combining multiple sensors and growth stages per year, the coefficient of determination R² improved further, averaging 0.66 for the best-fitting regressions. Also, when considering combinations of sensors and growth stages across both cropping seasons, UAV and Spectrosense+ GPS among sensors, and Véraison and Flowering among growth stages, had the highest average R² values. These performances are consistent with previous work on machine learning algorithms that were manually fine-tuned. These results suggest that AutoML has greater long-term performance potential. To increase the efficiency of crop quality prediction, a balance must be struck between manual expert work and AutoML.
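
For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how NDVI is computed from near-infrared and red reflectance, and how the coefficient of determination R² of an ordinary least squares fit is obtained. It is an illustration only, not the authors' pipeline: the function names (`ndvi`, `ols_fit`, `r_squared`) are hypothetical, the reflectance and sugar values are synthetic and not taken from the study, and only the simplest of the regression models named in the abstract (OLS with one predictor) is shown.

```python
from statistics import mean


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, illustrative values only (assumptions, not data from the study).
nir = [0.52, 0.58, 0.61, 0.66, 0.70]    # near-infrared reflectance
red = [0.12, 0.10, 0.09, 0.08, 0.07]    # red reflectance
sugar = [17.1, 18.4, 18.9, 20.2, 20.6]  # hypothetical sugar values, arbitrary units

x = [ndvi(n, r) for n, r in zip(nir, red)]
slope, intercept = ols_fit(x, sugar)
pred = [intercept + slope * xi for xi in x]
print("R^2 on the synthetic data:", round(r_squared(sugar, pred), 3))
```

The R² printed here is computed on the same points used for fitting, so it only illustrates the formula; it says nothing about the predictive performance reported in the abstract.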

Keywords: AutoML; Bayesian optimization; NDVI; correlation; ensemble methods; quality prediction; sugars.

MeSH terms

  • Farms
  • Machine Learning
  • Remote Sensing Technology* / methods
  • Sugars
  • Vitis*

Substances

  • Sugars