Coastal Water Quality Modelling Using E. coli, Meteorological Parameters and Machine Learning Algorithms

Int J Environ Res Public Health. 2023 Jun 24;20(13):6216. doi: 10.3390/ijerph20136216.

Abstract

In this study, machine learning models were implemented to predict the classification of coastal waters in the region of Eastern Macedonia and Thrace (EMT) from Escherichia coli (E. coli) concentration and weather variables, within the framework of Directive 2006/7/EC. Six sampling stations in EMT, located on beaches of the regional units of Kavala, Xanthi, Rhodopi, Evros, Thasos and Samothraki, were selected. All 1039 samples were collected from May to September within a 14-year follow-up period (2009-2021). The weather parameters were acquired from nearby meteorological stations. The samples were analysed according to ISO 9308-1 for the detection and enumeration of E. coli. The vast majority of the samples fell into category 1 (Excellent), indicating the high quality of the coastal waters of EMT. The experimental results also show that the two-class classifiers, namely Decision Forest, Decision Jungle and Boosted Decision Tree, achieved Accuracy scores above 99%. A comparison of our performance metrics with those of other researchers shows that a variety of algorithms are used for water quality prediction, with Decision Tree, Artificial Neural Networks and Bayesian Belief Networks demonstrating satisfactory results. Machine learning approaches can provide critical information about the dynamics of E. coli contamination while also taking meteorological parameters into account for coastal water classification.
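For readers unfamiliar with the Accuracy metric cited for the two-class classifiers, it is conventionally the fraction of samples whose predicted class matches the true class. The following is a minimal sketch of that calculation only, not the study's pipeline. The function name `accuracy`, the label encoding (1 for "Excellent", 0 for any other category) and the label values are all hypothetical and are not data from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # Count positions where the prediction agrees with the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels (assumption): 1 = "Excellent", 0 = any other category.
y_true = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
y_pred = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]

# Nine of the ten hypothetical predictions match the true labels.
score = accuracy(y_true, y_pred)
```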

Keywords: E. coli; coastal water; machine learning; pollution; predictive modelling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Escherichia coli*
  • Machine Learning
  • Water Quality*

Grants and funding

This research received no external funding.