Super ensemble based streamflow simulation using multi-source remote sensing and ground gauged rainfall data fusion

Heliyon. 2023 Jul 6;9(7):e17982. doi: 10.1016/j.heliyon.2023.e17982. eCollection 2023 Jul.

Abstract

Traditional data-driven streamflow prediction usually relies on a single model whose performance is inconsistent across different variability conditions. Model ensembles, which merge the benefits of different models without losing the general character of the data, are becoming a trend in hydrology. This study compared three super ensemble learners with eight base models. Twelve years of monthly rolled daily time series data from three river catchments of Ethiopia (Borkena watershed, Awash river basin; Gummera watershed, Abay river basin; and Sore watershed, Baro Akobo river basin) were used for single-step daily streamflow simulation, with the previous thirty days as input timesteps. Five input scenarios were applied: three vegetation indices, three remote sensing-based precipitation products, ground-gauged rainfall, all inputs fused, and inputs selected with the Recursive Feature Elimination (RFE) algorithm. The time series was then divided into training and testing datasets at a ratio of 80:20. Model performance was evaluated using the Root Mean Squared Error (RMSE), coefficient of determination (R2), Mean Absolute Error (MAE), and Median Absolute Error (MEDAE), and the results are presented for each of the five input scenarios. Performance averaged over catchments and input scenarios indicated that the three super ensemble learners outperformed the eight base models and remained relatively stable. The top-ranked WASE model exceeded the linear regression baseline by 13.3%. Among the base models, XGB, CNN-GRU, and LSTM achieved the highest performance. This study also revealed that LSTM's key downside is a drop in performance when no feature selection is applied. In contrast, XGB showed superior performance by controlling redundant inputs internally. Moreover, this study uniquely highlights the potential of remote sensing-based vegetation indices for data-driven streamflow modelling in ungauged catchments that have no meteorological time series.
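
The data preparation and scoring steps named in the abstract (thirty-day input windows, an 80:20 split, and the four error metrics) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `make_windows`, `chronological_split`, and `evaluate` are hypothetical, the split is assumed to be chronological without shuffling, and R2 is assumed to be the coefficient of determination computed as 1 - SS_res/SS_tot.

```python
import numpy as np


def make_windows(features, target, lookback=30):
    """Pair each day's streamflow with the preceding `lookback` days of inputs.

    Hypothetical helper: the abstract states only that thirty previous
    timesteps are used for single-step simulation.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    if features.ndim == 1:
        features = features[:, None]  # treat a single series as one feature
    n = len(target) - lookback
    if n <= 0:
        raise ValueError("series is shorter than the lookback window")
    # X[i] holds days i .. i+lookback-1; y[i] is the target on day i+lookback.
    X = np.stack([features[i:i + lookback] for i in range(n)])
    y = target[lookback:]
    return X, y


def chronological_split(X, y, train_frac=0.8):
    """Split into train/test parts without shuffling (assumed, not confirmed)."""
    cut = int(len(y) * train_frac)
    return X[:cut], X[cut:], y[:cut], y[cut:]


def evaluate(y_true, y_pred):
    """Return RMSE, R2, MAE, and MEDAE for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": float(1.0 - ss_res / ss_tot),
        "MAE": float(np.mean(np.abs(err))),
        "MEDAE": float(np.median(np.abs(err))),
    }


if __name__ == "__main__":
    # Synthetic stand-in data; the study's rainfall and vegetation inputs
    # are not reproduced here.
    rng = np.random.default_rng(0)
    inputs = rng.random((400, 3))
    flow = inputs.sum(axis=1) + rng.normal(0.0, 0.1, size=400)

    X, y = make_windows(inputs, flow, lookback=30)
    X_train, X_test, y_train, y_test = chronological_split(X, y)

    # Naive persistence-style predictor: mean of the training targets.
    baseline = np.full_like(y_test, y_train.mean())
    print(evaluate(y_test, baseline))
```

Any of the base or ensemble models would take the place of the naive baseline at the end; keeping the split chronological avoids leaking future flow values into training, which is why it is assumed here.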

Keywords: Ethiopian river basins; Gauge-rainfall; Remote sensing; Streamflow prediction; Super ensemble learning.