A hybrid deep learning approach to predict hourly riverine nitrate concentrations using routine monitored data

Yue Hu; Chuankun Liu; Wilfred M Wollheim; Tong Jiao; Meng Ma

doi:10.1016/j.jenvman.2024.121097

J Environ Manage. 2024 May 10:360:121097. doi: 10.1016/j.jenvman.2024.121097. Online ahead of print.

Authors

Yue Hu¹, Chuankun Liu², Wilfred M Wollheim³, Tong Jiao¹, Meng Ma⁴

Affiliations

¹ State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, 610059, China.
² Sichuan Academy of Environmental Policy and Planning, Department of Ecology and Environment of Sichuan Province, Chengdu, 610059, China. Electronic address: liuchuankun@pku.edu.cn.
³ Department of Natural Resources and Environment, University of New Hampshire, Durham, NH, 03824, USA.
⁴ China Institute of Water Resources and Hydropower Research, Beijing, 100048, China.

PMID: 38733844

Abstract

High-frequency nitrate (NO₃-N) concentration data are increasingly important for understanding watershed system behavior and for ecosystem management, so acquiring such data accurately and economically has become a key challenge. This study used coupled deep learning neural networks and routinely monitored data to predict hourly NO₃-N concentrations in a river. The hourly NO₃-N concentration at the outlet of the Oyster River watershed in New Hampshire, USA, was predicted with a hybrid model architecture coupling a Convolutional Neural Network with a Long Short-Term Memory model (CNN-LSTM). The routinely monitored data used for model training (river depth, water temperature, air temperature, precipitation, specific conductivity, pH, and dissolved oxygen concentration) were collected from a nested high-frequency monitoring network; the high-frequency NO₃-N concentration data obtained at the outlet were not included as inputs. The whole dataset was split into training, validation, and testing subsets at a ratio of 5:3:2. The hybrid CNN-LSTM model with different input lengths (1 d, 3 d, 7 d, 15 d, 30 d) performed comparably to, or even better than, the results reported in other studies at lower temporal frequencies, with mean Nash-Sutcliffe Efficiency values of 0.60-0.83. Models with shorter input lengths showed both higher accuracy and greater stability. Water level, water temperature, and pH at the monitoring sites were the main factors controlling forecasting performance. This study provides new insight into using deep learning networks with a coupled architecture and routinely monitored data for high-frequency riverine NO₃-N concentration forecasting, and offers suggestions on selecting variables and input lengths when preprocessing input data.

Keywords: CNN; Data pre-processing; LSTM; Nitrate export pattern; Nitrate forecasting.
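
The preprocessing and evaluation steps named in the abstract (the 5:3:2 split, input lengths of 1 to 30 days, and the Nash-Sutcliffe Efficiency) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and several details are assumptions the abstract does not state: that the split is chronological rather than random, that each row is one hourly record (so a 1 d input length is 24 rows), and that an input window ends at the hour being predicted. The CNN-LSTM model itself is omitted.

```python
from typing import List, Sequence, Tuple


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - residual / variance


def chronological_split(
    n_samples: int, ratios: Tuple[int, int, int] = (5, 3, 2)
) -> Tuple[range, range, range]:
    """Split sample indices into train/validation/test blocks in time order.

    Assumption: the abstract gives only the 5:3:2 ratio; splitting in
    chronological order (no shuffling) is this sketch's choice.
    """
    total = sum(ratios)
    train_end = n_samples * ratios[0] // total
    val_end = n_samples * (ratios[0] + ratios[1]) // total
    return range(0, train_end), range(train_end, val_end), range(val_end, n_samples)


def make_windows(
    features: Sequence[Sequence[float]],
    target: Sequence[float],
    input_hours: int,
) -> List[Tuple[List[Sequence[float]], float]]:
    """Pair each target value with the `input_hours` predictor rows ending at that hour.

    Assumption: the window includes the hour being predicted; the abstract
    does not say how windows are aligned with the target.
    """
    if len(features) != len(target):
        raise ValueError("features and target must have the same number of rows")
    samples = []
    for t in range(input_hours - 1, len(target)):
        window = list(features[t - input_hours + 1 : t + 1])
        samples.append((window, target[t]))
    return samples
```

Under these assumptions, a 3 d input length corresponds to `input_hours=72`, and predictions on the test block would be scored with `nash_sutcliffe(observed, simulated)`. A value of 1 indicates a perfect match, and 0 means the model does no better than predicting the observed mean.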