Predictive modeling of Persian walnut (Juglans regia L.) in vitro proliferation media using machine learning approaches: a comparative study of ANN, KNN and GEP models

Plant Methods. 2022 Apr 11;18(1):48. doi: 10.1186/s13007-022-00871-5.

Abstract

Background: Optimizing plant tissue culture media is a complicated process that is easily influenced by genotype, mineral nutrients, plant growth regulators (PGRs), vitamins and other factors, which often leads to undesirable and inefficient medium composition. Given the incidence of physiological disorders such as callusing, shoot tip necrosis (STN) and vitrification (Vit) in walnut proliferation, it is necessary to develop prediction models that identify the impact of the different factors involved in this process. In the present study, three machine learning (ML) approaches, namely multi-layer perceptron neural network (MLPNN), k-nearest neighbors (KNN) and gene expression programming (GEP), were implemented and compared with multiple linear regression (MLR) to develop models for predicting in vitro proliferation of Persian walnut (Juglans regia L.). The accuracy of the developed models was evaluated using the coefficient of determination (R²), root mean square error (RMSE) and mean absolute error (MAE). To optimize the selected prediction models, a multi-objective evolutionary optimization algorithm based on the particle swarm optimization (PSO) technique was applied.
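As an illustration of the three accuracy measures named above, the following minimal Python sketch computes the coefficient of determination, RMSE and MAE from paired observed and predicted values. The function names and the sample values are hypothetical and are not taken from the study; the formulas are the standard textbook definitions.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

# Hypothetical observed vs. predicted responses (illustrative values only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.2, 1.9, 3.3, 3.8]
print(r2_score(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

A higher coefficient of determination and lower RMSE and MAE indicate a closer fit between predicted and observed values.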

Results: All three ML techniques predicted more accurately than MLR. For example, the calculated R² of MLPNN, KNN and GEP vs. MLR was 0.695, 0.672 and 0.802 vs. 0.412 in Chandler and 0.358, 0.377 and 0.428 vs. 0.178 in Rayen, respectively. The GEP models were then selected for optimization using PSO. The comparison of modeling procedures provides new insight into prediction models for in vitro culture medium composition. Based on the results, the hybrid GEP-PSO technique performs well for modeling walnut tissue culture media, while MLPNN and KNN also show strong estimation capability.
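To give a sense of how PSO searches for an optimum, the sketch below implements a basic single-objective particle swarm in plain Python. It is a simplification under stated assumptions: the study applied a multi-objective PSO to fitted GEP models, whereas this example minimizes one hypothetical quadratic objective standing in for a fitted model. The function name, parameter values (inertia `w`, coefficients `c1` and `c2`) and the objective are illustrative choices, not the authors' settings.

```python
import random

def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic single-objective PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical objective with its minimum at (1, 2); a stand-in for a fitted model.
def toy_objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

best, best_val = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_val)
```

In a medium-optimization setting, the position vector would hold candidate component concentrations and the objective would be the prediction of a fitted model, negated if the response is to be maximized.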

Conclusion: In addition to MLPNN and GEP, KNN is introduced here, for the first time, as a simple technique with high accuracy for developing prediction models in studies optimizing plant tissue culture media composition. The choice of modeling technique therefore depends on the researcher's priorities: simplicity of the procedure, clear results expressed as a complete formula, and/or shorter analysis time.
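The simplicity of KNN can be seen in a minimal regression sketch: a prediction is the mean response of the k training samples closest to the query point. The code below assumes Euclidean distance and unweighted averaging, which may differ from the configuration used in the study; the function name and the data are hypothetical.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict a response as the mean of the k nearest training responses."""
    # Pair each training response with its Euclidean distance to the query.
    pairs = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    pairs.sort(key=lambda pair: pair[0])
    nearest = pairs[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical data: two medium components per sample and one measured response.
train_x = [[0.5, 1.0], [1.0, 1.0], [1.5, 2.0], [2.0, 2.5]]
train_y = [2.1, 2.8, 3.5, 3.0]
print(knn_predict(train_x, train_y, [1.2, 1.5], k=2))
```

Unlike GEP, this approach yields no explicit formula; the fitted "model" is the training data itself, so the trade-off is simplicity against interpretability.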

Keywords: Artificial neural network; Gene expression programming; Prediction model; Walnut in vitro propagation; k-nearest neighbors.