A novel framework to predict chlorophyll-a concentrations in water bodies through multi-source big data and machine learning algorithms

Environ Sci Pollut Res Int. 2023 Jul;30(32):79402-79422. doi: 10.1007/s11356-023-27886-2. Epub 2023 Jun 7.

Abstract

Eutrophication occurs when water bodies become enriched with minerals and nutrients. Dense blooms of noxious algae are its most obvious effect: they degrade water quality and, by increasing toxic substances, damage the aquatic ecosystem. It is therefore critical to monitor and investigate how eutrophication develops. The concentration of chlorophyll-a (chl-a) in a water body is an essential indicator of its eutrophication. Previous studies predicting chl-a concentrations suffered from low spatial resolution and from discrepancies between predicted and observed values. In this paper, we combined various remote sensing and ground observation data and proposed a novel machine learning-based framework, a random forest inversion model, to map the spatial distribution of chl-a at 2 m spatial resolution. The results showed that our model outperformed other base models: the goodness of fit improved by over 36.6%, while MSE and MAE decreased by over 15.17% and over 21.26%, respectively. Moreover, we compared the feasibility of GF-1 and Sentinel-2 remote sensing data for chl-a concentration prediction. We found that GF-1 data gave better prediction results, with the goodness of fit reaching 93.1% and an MSE of only 3.589. The proposed method and findings can be used in future water management studies and as an aid for decision-makers in this field.
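
The abstract reports model quality as goodness of fit, MSE, and MAE. As a minimal sketch of how such scores are typically computed from paired observed and predicted chl-a values, the Python below assumes that "goodness of fit" means the coefficient of determination (R²), which the abstract does not state explicitly. The function names and the sample values are hypothetical and are not taken from the study.

```python
# Illustrative only: the observed/predicted values below are made up,
# and treating "goodness of fit" as R^2 is an assumption.

def mse(observed, predicted):
    """Mean squared error between paired observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error between paired observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical chl-a concentrations at four sampling points.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print("MSE:", mse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("R^2:", r_squared(observed, predicted))
```

A higher R² together with lower MSE and MAE indicates closer agreement between predicted and observed concentrations, which is the sense in which the abstract compares models and data sources.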

Keywords: Eutrophication; Random forest; Remote sensing; Spatial analysis; Water pollution.

MeSH terms

  • Algorithms
  • Big Data*
  • Chlorophyll / analysis
  • Chlorophyll A
  • Ecosystem*
  • Environmental Monitoring / methods
  • Eutrophication
  • Lakes
  • Machine Learning

Substances

  • Chlorophyll A
  • Chlorophyll