Machine learning models for streamflow regionalization in a tropical watershed

J Environ Manage. 2021 Feb 15:280:111713. doi: 10.1016/j.jenvman.2020.111713. Epub 2020 Nov 27.

Abstract

This study assesses different machine learning approaches for streamflow regionalization in a tropical watershed, analyzes their advantages and limitations, and points out the benefits of using them for water resources management. Three algorithms were applied: Random Forest, Earth and a linear model. The response variables were three types of minimum streamflow (Q7.10, Q95 and Q90) and the long-term average streamflow (Qmld). The database comprised 76 environmental covariates related to morphometry, topography, climate, land use and cover, and surface conditions. Covariates were eliminated in two steps: Pearson's correlation analysis and importance analysis by Recursive Feature Elimination (RFE). The models were validated with the following statistical metrics: Nash-Sutcliffe coefficient (NSE), percent bias (PBIAS), Willmott's index of agreement (d), coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE) and relative error (RE). The linear model was unsatisfactory for all response variables. The nonlinear models performed well, and their covariate of greatest predictive importance was the flow equivalent to the precipitated volume after subtracting an abstraction factor of 750 mm (Peq750). Overall, the Random Forest and Earth models performed similarly and predicted both the minimum streamflows and the long-term average streamflow well, making them powerful and promising alternatives for streamflow regionalization in support of the management and integrated planning of water resources at the river basin level.
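As an illustration of the validation metrics named above, the following is a minimal Python sketch using their common textbook definitions. It is not the authors' implementation. The abstract does not state which conventions were used, so several choices here are assumptions: PBIAS is taken as 100 × Σ(obs − sim) / Σ(obs), so positive values indicate underestimation (some software uses the opposite sign); d is Willmott's original index of agreement; and R² is the squared Pearson correlation between observed and predicted values. Relative error (RE) is omitted because its exact definition is not given. All function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obar = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - obar) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(obs, sim):
    """Percent bias. Assumed sign convention: positive means underestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def willmott_d(obs, sim):
    """Willmott's original index of agreement, bounded between 0 and 1."""
    obar = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - obar) + abs(o - obar)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den


def r2(obs, sim):
    """Coefficient of determination, assumed here to be squared Pearson r."""
    obar, sbar = _mean(obs), _mean(sim)
    cov = sum((o - obar) * (s - sbar) for o, s in zip(obs, sim))
    var_o = sum((o - obar) ** 2 for o in obs)
    var_s = sum((s - sbar) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def rmse(obs, sim):
    """Root mean square error, in the units of the streamflow variable."""
    return math.sqrt(_mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mae(obs, sim):
    """Mean absolute error, in the units of the streamflow variable."""
    return _mean([abs(o - s) for o, s in zip(obs, sim)])


if __name__ == "__main__":
    # Made-up values for illustration only; they are not data from the study.
    observed = [1.0, 2.0, 3.0]
    predicted = [2.0, 3.0, 4.0]
    for metric in (nse, pbias, willmott_d, r2, rmse, mae):
        print(metric.__name__, metric(observed, predicted))
```

The made-up example shifts every prediction by a constant, which shows why several metrics are reported together: a correlation-based R² can look perfect while NSE and PBIAS expose the systematic bias.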

Keywords: Artificial intelligence; Hydrological modeling; River flow; Ungauged basins.

MeSH terms

  • Climate
  • Machine Learning
  • Models, Theoretical*
  • Rivers*
  • Water Movements