Machine learning algorithms for forecasting the incidence of Coffea arabica pests and diseases

Int J Biometeorol. 2020 Apr;64(4):671-688. doi: 10.1007/s00484-019-01856-1. Epub 2020 Jan 7.

Abstract

Disease and pest alert models can indicate when agrochemical applications are actually needed, reducing costs and environmental impacts. Machine learning algorithms make it possible to develop weather-driven models for disease and pest warning systems, improving the efficiency of chemical control of coffee tree pests. We therefore correlated infection rates with weather variables, and calibrated and tested machine learning algorithms to predict the incidence of coffee rust, cercospora, coffee miner, and coffee borer. We used weather and field data from producing coffee plantations in the southern regions of the State of Minas Gerais (SOMG) and in the Cerrado Mineiro region; these crops did not receive phytosanitary treatments. The algorithms calibrated and tested for prediction were (a) Multiple Linear Regression (RLM), (b) K Neighbors Regressor (KNN), (c) Random Forest Regressor (RFT), and (d) Artificial Neural Networks (MLP). The dependent variables were the monthly rates of coffee rust, cercospora, coffee miner, and coffee borer, and the weather elements were the independent (predictor) variables. Pearson correlation analyses were performed for three time periods, 1-10 d (from 1 to 10 days before the incidence evaluation), 11-20 d, and 21-30 d, to evaluate the pairwise correlations between the weather variables and the infection rates of coffee diseases and pests. The models were calibrated separately for years of high and low yields, because the biennial variation in coffee bean yield influences the severity of the diseases. The models were compared using Willmott's 'd', RMSE (root mean square error), and the coefficient of determination (R2). The results of the most accurate algorithm were spatialized for the SOMG and Cerrado Mineiro regions using the kriging method.
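The lagged correlation step described above can be illustrated with a minimal sketch. It assumes a daily weather series stored as a plain list and a list of day indices on which incidence was evaluated; the function names, the use of a window mean as the aggregate, and the data layout are illustrative assumptions, not the authors' implementation.

```python
import math

# The three lag windows named in the abstract, as (first, last) days
# before the incidence evaluation.
WINDOWS = {"1-10 d": (1, 10), "11-20 d": (11, 20), "21-30 d": (21, 30)}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def window_mean(daily, eval_day, first, last):
    """Mean of the daily values from `last` to `first` days before eval_day.

    Assumption: the window is summarized by its mean; a sum or a count of
    days above a threshold could be used instead.
    """
    values = daily[eval_day - last : eval_day - first + 1]
    return sum(values) / len(values)


def lagged_correlations(daily, eval_days, incidence):
    """Correlate incidence with the weather aggregate of each lag window."""
    result = {}
    for name, (first, last) in WINDOWS.items():
        aggregated = [window_mean(daily, d, first, last) for d in eval_days]
        result[name] = pearson(aggregated, incidence)
    return result
```

Each evaluation day must lie at least 30 days after the start of the daily series so that the 21-30 d window is fully covered.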
The weather variables that showed significant correlations with coffee rust were maximum air temperature, number of days with relative humidity above 80%, and relative humidity. RFT was the most accurate algorithm for predicting coffee rust, cercospora, coffee miner, and coffee borer from weather conditions. In the SOMG, RFT showed greater accuracy in the predictions for the Cerrado Mineiro in years of high and low yields and for all diseases. In the SOMG, the RMSE values for coffee borer forecasting ranged from 0.227 to 0.853 for high-yield coffee and from 0.147 to 0.827 for low-yield coffee.
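For reference, the three comparison indices named in the abstract can be computed as in the sketch below. It uses the standard definitions of RMSE and Willmott's index of agreement; R2 is taken here as the squared Pearson correlation between observed and predicted values, which is an assumption, since the abstract does not state which definition was used.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def willmott_d(observed, predicted):
    """Willmott's index of agreement 'd' (1 means perfect agreement)."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    denominator = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator


def r_squared(observed, predicted):
    """R2 as the squared Pearson correlation (assumed definition)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov ** 2 / (var_obs * var_pred)
```

Note that a constant bias leaves this R2 at 1 while RMSE and 'd' degrade, which is one reason to report the three indices together.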

Keywords: Artificial intelligence; Big data; Crop modeling; Phytosanitary maps.

MeSH terms

  • Algorithms
  • Coffea*
  • Coffee
  • Incidence
  • Machine Learning

Substances

  • Coffee