Automated scraping and analyses of drinking water quality data

Int J Hyg Environ Health. 2024 Jan:255:114295. doi: 10.1016/j.ijheh.2023.114295. Epub 2023 Nov 22.

Abstract

Although drinking water quality is regularly monitored in Germany, the data are not available as a national overview, only in decentralized form from the individual water suppliers. At the national level, only the number of limit exceedances is reported. However, an overview of drinking water quality that is as complete as possible is necessary to assess and develop regulations, and it is helpful for authorities, political decision makers, the public and the scientific community. Because the data sources are fragmented, web scraping was used in the present study to address these challenges and knowledge gaps. Data from 502 water supply areas were compiled and evaluated. The extent and form of the reported values varied strongly, as did the availability of data for the different water supply areas. The results show that the scraped values were well below, rather than close to, the associated legal limits or guidance values. For organic parameters, the reported values were mostly below the respective limits of quantification. However, further development is needed to cover more water supply areas in Germany and internationally.
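
As an illustration of the kind of evaluation described above, the following minimal sketch normalizes a reported value and relates it to a regulatory limit. It is not the study's code: the function names, the accepted input formats (decimal comma, a leading "<" for values below the limit of quantification) and the example strings are assumptions made for illustration. The nitrate limit of 50 mg/L is the value set by the German Drinking Water Ordinance.

```python
import re
from typing import Optional, Tuple

# Nitrate limit of the German Drinking Water Ordinance, in mg/L.
NITRATE_LIMIT_MG_L = 50.0


def parse_reported_value(text: str) -> Optional[Tuple[float, bool]]:
    """Parse a reported value such as '12,3' or '< 0.5' (hypothetical formats).

    Returns (value, below_loq), where below_loq is True when the supplier
    reported the value as below its limit of quantification ('<' prefix).
    Returns None when the string is not a plain number.
    """
    cleaned = text.strip().replace(",", ".")  # accept a decimal comma
    match = re.fullmatch(r"(<)?\s*(\d+(?:\.\d+)?)", cleaned)
    if match is None:
        return None
    return float(match.group(2)), match.group(1) is not None


def fraction_of_limit(text: str, limit: float) -> Optional[float]:
    """Return the reported value as a fraction of the given limit.

    For values reported as '< x', the result is an upper bound, because
    the true concentration lies somewhere below x.
    """
    parsed = parse_reported_value(text)
    if parsed is None:
        return None
    value, _below_loq = parsed
    return value / limit
```

Under these assumptions, a call such as `fraction_of_limit("12,3", NITRATE_LIMIT_MG_L)` gives the share of the limit that a reported nitrate value represents, while entries that are not plain numbers yield `None` and would need separate handling.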

Keywords: Data harvesting; Data mining; Drinking water data; Information of drinking water consumers; Regulatory limit; Reporting.

Publication types

  • Review

MeSH terms

  • Drinking Water*
  • Environmental Monitoring
  • Germany
  • Water Quality
  • Water Supply

Substances

  • Drinking Water