Global marine phytoplankton dynamics analysis with machine learning and reanalyzed remote sensing

PeerJ. 2024 May 8;12:e17361. doi: 10.7717/peerj.17361. eCollection 2024.

Abstract

Phytoplankton, found in oceans, seas and other large water bodies, are the world's largest oxygen producers and play crucial roles in the marine food chain. Imbalances in biogeochemical features such as salinity, pH and mineral content can retard their growth. With advances in hardware, artificial intelligence techniques are increasingly used to build intelligent decision-making systems. We therefore apply supervised regression to reanalysis data to estimate phytoplankton levels in global waters. The presented experiment applies several supervised machine learning regression techniques (random forest, extra trees, bagging and histogram-based gradient boosting) to reanalysis data obtained from the Copernicus Global Ocean Biogeochemistry Hindcast dataset. In the experiment, the models predicted phytoplankton levels with a coefficient of determination (R²) of up to 0.96. After further validation with larger datasets, the model could be deployed in a production environment to complement in-situ measurement efforts.

Keywords: Global waters; Machine learning; Ocean biogeochemistry; Phytoplankton; Regression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Environmental Monitoring / methods
  • Machine Learning*
  • Oceans and Seas
  • Phytoplankton*
  • Remote Sensing Technology* / instrumentation
  • Remote Sensing Technology* / methods
  • Supervised Machine Learning

Grants and funding

The authors received no funding for this work.