Prediction of Greenhouse Tomato Crop Evapotranspiration Using XGBoost Machine Learning Model

Plants (Basel). 2022 Jul 25;11(15):1923. doi: 10.3390/plants11151923.

Abstract

Crop evapotranspiration (ET) is a key parameter for achieving functional irrigation systems. However, ET is difficult to measure directly, so an ideal solution is to develop a simulation model to estimate it. There are many ways to calculate ET, most of which use models based on the Penman-Monteith equation, but these are often inaccurate when applied to greenhouse crops. The use of machine learning models to predict ET has gradually increased, but research into their application to greenhouse crops is relatively rare. We used three years of experimental data (2019–2021) to model the effects on ET of eight meteorological factors (net solar radiation (Rn), mean temperature (Ta), minimum temperature (Tamin), maximum temperature (Tamax), relative humidity (RH), minimum relative humidity (RHmin), maximum relative humidity (RHmax), and wind speed (V)) using an ET prediction model for greenhouse drip-irrigated tomato crops (XGBR-ET) based on XGBoost regression (XGBR). The model was compared with seven other common regression models (linear regression (LR), support vector regression (SVR), K neighbors regression (KNR), random forest regression (RFR), AdaBoost regression (ABR), bagging regression (BR), and gradient boosting regression (GBR)). The results showed that Rn, Ta, and Tamax were positively correlated with ET, and that Tamin, RH, RHmin, RHmax, and V were negatively correlated with ET. Rn had the strongest correlation with ET (r = 0.89), and V had the weakest (r = 0.43). In order of prediction accuracy, the eight models ranked XGBR-ET > GBR-ET > SVR-ET > ABR-ET > BR-ET > LR-ET > KNR-ET > RFR-ET. The statistical indicators for XGBR-ET, namely mean square error (0.032), root mean square error (0.163), mean absolute error (0.132), mean absolute percentage error (4.47%), and coefficient of determination (0.981), showed that the model predicted daily ET for greenhouse tomatoes well.
An ablation of the XGBR-ET input parameters showed that the order of importance of the meteorological factors was Rn > RH > RHmin > Tamax > RHmax > Tamin > Ta > V. Using only Rn, RH, RHmin, Tamax, and Tamin as XGBR input variables preserved the prediction accuracy of the model (mean square error 0.047). This study has value as a reference for simplifying the calculation of evapotranspiration for drip-irrigated greenhouse tomato crops, using a novel application of machine learning as a basis for an effective irrigation program.
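
The five statistical indicators named above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions of each indicator, which may differ in detail from those used in the study. The function name `regression_metrics` and the sample values are hypothetical and are not data from the experiment.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, MAPE (%) and R^2 for paired ET values.

    Assumes standard definitions; observed values must be non-zero
    (for MAPE) and must not all be equal (for R^2).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    sum_sq_err = sum(e * e for e in errors)
    mse = sum_sq_err / n                      # mean square error
    rmse = math.sqrt(mse)                     # root mean square error
    mae = sum(abs(e) for e in errors) / n     # mean absolute error
    # mean absolute percentage error, expressed in percent
    mape = 100.0 * sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - sum_sq_err / ss_tot            # coefficient of determination

    return {"mse": mse, "rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical daily ET values (illustrative only, not from the study).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.5, 5.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under these definitions, lower MSE, RMSE, MAE, and MAPE and a coefficient of determination closer to 1 indicate closer agreement between predicted and observed ET.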

Keywords: XGBoost regression; drip irrigated tomato; evapotranspiration; machine learning; solar greenhouse.