A novel dynamic multi-criteria ensemble selection mechanism applied to drinking water quality anomaly detection

Sci Total Environ. 2020 Dec 20:749:142368. doi: 10.1016/j.scitotenv.2020.142368. Epub 2020 Sep 15.

Abstract

The provision of clean and safe drinking water is a crucial task for water supply companies worldwide. To this end, automatic anomaly detection plays a critical role in drinking water quality monitoring. Recent anomaly detection studies use techniques that focus on a single global objective. Yet companies need solutions that better balance the trade-off between false positives (FPs), which lead to financial losses for water companies, and false negatives (FNs), which severely impact public health and damage the environment. This work proposes a novel dynamic multi-criteria ensemble selection mechanism to cope with both problems simultaneously: the non-dominated local class-specific accuracy (NLCA). The experiments rely on recent time-series-related classification metrics to assess predictive performance. Results on data from a real-world water distribution system show that NLCA outperforms other ensemble learning and dynamic ensemble selection techniques by more than 15% in terms of time-series-related F1 scores. In conclusion, NLCA enables the development of stronger anomaly detection systems for drinking water quality monitoring. The proposed technique also offers a new perspective on dynamic ensemble selection, which can be applied to different classification tasks to balance conflicting criteria.
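
The abstract does not specify how NLCA is computed, so the following Python sketch is not the authors' algorithm. It only illustrates the general idea of non-dominated selection under two conflicting criteria: from a pool of classifiers, keep those that no other classifier beats on both criteria at once. It assumes each classifier has already been given two local scores near a query sample, accuracy on the normal class (related to FPs) and accuracy on the anomaly class (related to FNs). The function names and the example scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every criterion
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(scores: List[Tuple[float, float]]) -> List[int]:
    """Indices of classifiers whose score vector is not dominated
    by the score vector of any other classifier in the pool."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical local scores for four classifiers around one query sample:
# (accuracy on the normal class, accuracy on the anomaly class).
local_scores = [(0.95, 0.40), (0.80, 0.75), (0.70, 0.70), (0.60, 0.90)]

# Classifier 2 is dropped because classifier 1 is better on both criteria;
# the remaining three each represent a different FP/FN trade-off.
selected = non_dominated(local_scores)
print(selected)
```

In this sketch the selected classifiers would then be combined, for example by majority vote, to label the query sample; how the actual NLCA mechanism scores and aggregates its classifiers is described in the full paper, not in this abstract.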

Keywords: Anomaly detection; Drinking water quality; Dynamic ensemble selection; Ensemble learning; Machine learning; Time series classification.

MeSH terms

  • Algorithms*
  • Drinking Water*
  • Water Supply

Substances

  • Drinking Water