Spatiotemporal-aware machine learning approaches for dissolved oxygen prediction in coastal waters

Sci Total Environ. 2023 Dec 20:905:167138. doi: 10.1016/j.scitotenv.2023.167138. Epub 2023 Sep 19.

Abstract

Coastal waters face increasing threats from hypoxia, which can have severe consequences for marine life and fisheries. This study aims to develop a machine learning approach for hypoxia monitoring by investigating the effectiveness of four tree-based models, considering spatiotemporal effects in model prediction, and adopting the SHapley Additive exPlanations (SHAP) approach for model interpretability, using the long-term climate and marine monitoring dataset for Tolo Harbour (Zone 1) and Mirs Bay (Zone 2), Hong Kong. The LightBoost model was found to be the most effective for predicting dissolved oxygen (DO) concentrations using spatiotemporal datasets. Considering spatiotemporal effects improved the model's bottom DO prediction performance (R² increased by 0.30 in Zone 1 and 0.68 in Zone 2), although the contributions from temporal and spatial factors varied depending on the complexity of physical and chemical processes. This study focused not only on error estimates but also on model interpretation. Using SHAP, we propose that hypoxia is largely influenced by hydrodynamics, but anthropogenic activities can increase the bias of systems, exacerbating chemical reactions and impacting DO levels. Additionally, the high relative importance of silicate (Zone 1: 0.11 and Zone 2: 0.19) in the model suggests that terrestrial sources, particularly submarine groundwater discharge, are important factors influencing coastal hypoxia. This is the first machine learning effort to consider spatiotemporal effects in four dimensions to predict DO concentrations, and we believe it contributes to the development of a forecasting tool for hypoxia early warning that combines real-time data and machine learning models in the near future.
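The quantities reported in the abstract (an R² gain from adding spatiotemporal inputs, and a relative feature importance derived from SHAP values) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that the "four dimensions" are longitude, latitude, depth, and time, and that relative importance is the mean absolute SHAP value per feature normalized to sum to 1; the abstract confirms neither. All function and feature names are hypothetical, and the model fitting and SHAP computation themselves are left out.

```python
import math


def spatiotemporal_features(lon, lat, depth_m, year, month):
    """Encode where and when a sample was taken (hypothetical feature set).

    Assumption: space is (lon, lat, depth) and time is (year, month).
    The month is encoded cyclically so December and January are neighbours.
    """
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return {
        "lon": lon,
        "lat": lat,
        "depth_m": depth_m,
        "year": year,
        "month_sin": math.sin(angle),
        "month_cos": math.cos(angle),
    }


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_importance(shap_rows):
    """Normalize summed |SHAP| per feature so the shares add up to 1.

    shap_rows: one dict per sample, mapping feature name -> SHAP value.
    Assumption: this normalization is one common convention, not
    necessarily the one used in the study.
    """
    totals = {}
    for row in shap_rows:
        for name, value in row.items():
            totals[name] = totals.get(name, 0.0) + abs(value)
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}
```

Under these assumptions, the reported improvement would correspond to the difference between `r2_score` for a model trained with the spatiotemporal features and `r2_score` for a model trained without them, evaluated on the same held-out bottom DO observations.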

Keywords: Dissolved oxygen; Hypoxia; Machine learning; Prediction; Spatiotemporal factors.