Evaluation of Random Forests (RF) for Regional and Local-Scale Wheat Yield Prediction in Southeast Australia

Sensors (Basel). 2022 Jan 18;22(3):717. doi: 10.3390/s22030717.

Abstract

Wheat accounts for more than 50% of Australia's total grain production. The capability to generate accurate in-season yield predictions is important across all components of the agricultural value chain. The literature on wheat yield prediction points to a need for further work evaluating machine learning techniques such as random forests (RF) at multiple scales. This research applied a Random Forest Regression (RFR) technique to build regional and local-scale yield prediction models at the pixel level for three southeast Australian wheat-growing paddocks, one each in Victoria (VIC), New South Wales (NSW) and South Australia (SA), using 2018 yield maps from data supplied by collaborating farmers. Time-series Normalized Difference Vegetation Index (NDVI) data derived from Planet's high spatio-temporal resolution imagery, meteorological variables and yield data were used to train, test and validate the models at pixel level using Python libraries for (a) a regional-scale three-paddock composite and (b) individual paddocks. The composite region-wide RF model for the three paddocks performed well (R² = 0.86, RMSE = 0.18 t ha⁻¹). RF models for the individual paddocks in VIC (R² = 0.89, RMSE = 0.15 t ha⁻¹) and NSW (R² = 0.87, RMSE = 0.07 t ha⁻¹) also performed well, but performance was only moderate for SA (R² = 0.45, RMSE = 0.25 t ha⁻¹). Generally, high yield values were underpredicted and low yield values were overpredicted. This study demonstrated the feasibility of applying RF modeling on satellite imagery and yielded 'big data' for regional as well as local-scale yield prediction.
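
As a rough illustration of the quantities named in the abstract, the sketch below computes NDVI from near-infrared and red reflectance using the standard definition, then scores a set of pixel-level yield predictions with RMSE and R². It is not the authors' code. The function names and the four observed/predicted yield values are hypothetical, and R² is assumed here to be the coefficient of determination (1 - SS_res / SS_tot); the abstract does not say which definition the study used.

```python
from math import sqrt


def ndvi(nir, red):
    """Standard NDVI for one pixel: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        # Assumption for this sketch: a pixel with no signal gets NDVI 0.
        return 0.0
    return (nir - red) / total


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the yields (t/ha)."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical pixel yields in t/ha, not data from the study.
# The predictions are pulled toward the mean, so the lowest yield is
# overpredicted and the highest is underpredicted.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 4.0, 4.5]

if __name__ == "__main__":
    print("NDVI:", round(ndvi(0.5, 0.1), 3))
    print("RMSE:", round(rmse(observed, predicted), 3))
    print("R2:", round(r_squared(observed, predicted), 3))
```

In this toy case the errors sit only at the extremes of the yield range, which is the pattern the abstract describes: a regression model that compresses predictions toward the mean can still score a high R² while overpredicting low yields and underpredicting high ones.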

Keywords: Normalized Difference Vegetation Index (NDVI); random forests; satellite imagery; wheat; yield prediction.

MeSH terms

  • Australia
  • Meteorology
  • Satellite Imagery*
  • Seasons
  • Triticum*