A novel chlorophyll-a retrieval model based on suspended particulate matter classification and different machine learning

Environ Res. 2024 Jan 1;240(Pt 1):117430. doi: 10.1016/j.envres.2023.117430. Epub 2023 Oct 20.

Abstract

Chlorophyll-a (Chla) in inland waters is one of the most significant optical parameters for aquatic ecosystem assessment, and long-term, daily monitoring of Chla concentration has the potential to facilitate early warning of algal blooms. MOD09 products offer several observational advantages (higher temporal and spatial resolution and a higher signal-to-noise ratio) and play an extremely important role in the remote sensing of water color. To develop a high-accuracy machine learning model for remotely estimating Chla concentration in inland waters from MOD09 products, this study hypothesized that the accuracy of Chla concentration retrieval would improve after classifying water bodies into three groups by suspended particulate matter (SPM) concentration. Ten commonly used machine learning models were compared and evaluated, including random forest regressor (RFR), deep neural networks (DNN), extreme gradient boosting (XGBoost), and convolutional neural network (CNN). Altogether, 41 basic bands and the 820 band ratios between them were screened by their correlation with Ln(Chla), and the selected bands were used as inputs to the different machine learning models. Results demonstrated that building the Chla remote estimation model on SPM classification significantly improved the correlation between Ln(Chla) and the 41 basic spectral bands, the correlation between Ln(Chla) and the 820 band ratios, and the model verification R² (from 0.41 to 0.83). Furthermore, B3, B20, and B32 were selected to classify SPM based on their correlation with SPM, and the classification accuracy could reach 0.9. Comparison of R², RMSE, and MAPE showed that the RFR model performed best. Comparison of the relative contribution of the input bands in the different groups showed that B3 contributed most in all three groups. The model constructed in this study has promising prospects for application in other inland waters and could provide a systematic reference for subsequent research.
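The band-screening and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the ratio direction (band i divided by band j for i < j), the use of Pearson correlation, and the choice to rank by absolute correlation are all assumptions made for illustration. The abstract does not state which correlation measure was used or whether the metrics were computed on Chla or Ln(Chla). Band indices are 0-based here, so B3 corresponds to index 2.

```python
import itertools

import numpy as np


def band_ratio_pairs(n_bands):
    """Return all unordered band pairs (i, j) with i < j.

    For 41 bands this yields 41 * 40 / 2 = 820 pairs, matching the
    number of band ratios reported in the abstract.
    """
    return list(itertools.combinations(range(n_bands), 2))


def pearson_with_target(features, target):
    """Pearson correlation of each feature column with a 1-D target."""
    fc = features - features.mean(axis=0)
    tc = target - target.mean()
    denom = np.sqrt((fc ** 2).sum(axis=0) * (tc ** 2).sum())
    return (fc.T @ tc) / denom


def rank_band_ratios(reflectance, chla, top_k=5):
    """Rank band ratios by |Pearson r| with Ln(Chla).

    reflectance: array of shape (n_samples, n_bands), strictly positive
    chla:        array of shape (n_samples,), strictly positive
    Returns a list of ((i, j), r) for the top_k ratios band_i / band_j.
    """
    pairs = band_ratio_pairs(reflectance.shape[1])
    i, j = np.array(pairs).T
    ratios = reflectance[:, i] / reflectance[:, j]  # assumed ratio direction
    r = pearson_with_target(ratios, np.log(chla))
    order = np.argsort(-np.abs(r))[:top_k]
    return [(pairs[k], float(r[k])) for k in order]


def regression_metrics(y_true, y_pred):
    """Compute R2, RMSE, and MAPE (in percent) for a set of predictions."""
    residual = y_true - y_pred
    ss_res = float((residual ** 2).sum())
    ss_tot = float(((y_true - y_true.mean()) ** 2).sum())
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": float(np.sqrt((residual ** 2).mean())),
        "mape": float(np.mean(np.abs(residual / y_true)) * 100.0),
    }
```

In a workflow like the one described, `rank_band_ratios` would be applied separately within each SPM group, and the top-ranked bands would then be passed to a regressor such as a random forest, with `regression_metrics` used to compare the candidate models.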

Keywords: MOD09 products; Multiple machine learning models; Satellite.

MeSH terms

  • Algorithms
  • Chlorophyll / analysis
  • Chlorophyll A
  • Ecosystem*
  • Environmental Monitoring* / methods
  • Water

Substances

  • Chlorophyll A
  • Chlorophyll
  • Water