An integrated feature selection approach to high water stress yield prediction

Front Plant Sci. 2023 Dec 4:14:1289692. doi: 10.3389/fpls.2023.1289692. eCollection 2023.

Abstract

The timely and precise prediction of winter wheat yield plays a critical role in understanding food supply dynamics and ensuring global food security. In recent years, the application of unmanned aerial remote sensing has significantly advanced agricultural yield prediction research. This has led to the emergence of numerous vegetation indices that are sensitive to yield variations. However, not all of these vegetation indices are universally suitable for predicting yields across different environments and crop types. Consequently, the process of feature selection for vegetation index sets becomes essential to enhance the performance of yield prediction models. This study aims to develop an integrated feature selection method known as PCRF-RFE, with a focus on vegetation index feature selection. Initially, building upon prior research, we acquired multispectral images during the flowering and grain filling stages and identified 35 yield-sensitive multispectral indices. We then applied the Pearson correlation coefficient (PC) and random forest importance (RF) methods to select relevant features for the vegetation index set. Feature filtering thresholds were set at 0.53 and 1.9 for the respective methods. The union set of features selected by both methods was used for recursive feature elimination (RFE), ultimately yielding the optimal subset of features for constructing Cubist and Recurrent Neural Network (RNN) yield prediction models. The results of this study demonstrate that the Cubist model, constructed using the optimal subset of features obtained through the integrated feature selection method (PCRF-RFE), consistently outperformed the RNN model. It exhibited the highest accuracy during both the flowering and grain filling stages, surpassing models constructed using all features or subsets derived from a single feature selection method. This confirms the efficacy of the PCRF-RFE method and offers valuable insights and references for future research in the realms of feature selection and yield prediction studies.

Keywords: Cubist; PCRF-RFE; RNN; multispectral; performance.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Key Grant Technology Project of Henan (221100110700), Central Public-interest Scientific Institution Basal Research Fund (No. IFI2023-29, IFI2023-01), the Intelligent Irrigation Water and Fertilizer Digital Decision System and Regulation Equipment (2022YFD1900404), and the Agricultural Science and Technology Innovation Program (ASTIPNo. CAAS-ZDRW202201).