Data modelling recipes for SARS-CoV-2 wastewater-based epidemiology

Environ Res. 2022 Nov;214(Pt 1):113809. doi: 10.1016/j.envres.2022.113809. Epub 2022 Jul 5.

Abstract

Wastewater-based epidemiology is recognized as one of the monitoring pillars, providing essential information for pandemic management. Central to the methodology are data modelling concepts, both for communicating the monitoring results and for analysing the signal. Owing to the fast development of the field, a range of modelling concepts are in use, but without a coherent framework. This paper provides such a framework, focusing on robust and simple concepts that are readily applicable, rather than applying the latest findings from, e.g., machine learning. It is demonstrated that data preprocessing, most importantly normalization by means of biomarkers and equal temporal spacing of the scattered data, is crucial. For the latter, downsampling to a weekly spaced series is sufficient. Data smoothing also turned out to be essential, not only for communicating the signal dynamics but likewise for regressions, nowcasting and forecasting. Correlating the signal with epidemic indicators requires multivariate regression, as the signal alone cannot explain the dynamics; for this case study, however, multiple linear regression proved to be a suitable tool when the focus is on understanding and interpretation. It was also demonstrated that short-term prediction (7 days) is accurate with simple models (exponential smoothing or autoregressive models), but forecast accuracy deteriorates quickly for longer horizons.
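The preprocessing and short-term forecasting steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy measurements, the choice of ISO-week averaging for downsampling, and the smoothing factor alpha = 0.5 are all assumptions made for illustration. Weeks without any sample are simply absent from the output here, so a real series would still need gap filling to be equally spaced.

```python
from collections import defaultdict
from datetime import date


def normalize(virus_signal, biomarker):
    """Divide each viral measurement by the biomarker value of the same sample."""
    return [v / b for v, b in zip(virus_signal, biomarker)]


def weekly_mean(dates, values):
    """Downsample irregularly spaced samples to one mean value per ISO week."""
    bins = defaultdict(list)
    for d, v in zip(dates, values):
        iso = d.isocalendar()
        bins[(iso[0], iso[1])].append(v)  # key: (ISO year, ISO week)
    weeks = sorted(bins)
    return weeks, [sum(bins[w]) / len(bins[w]) for w in weeks]


def exp_smooth(series, alpha=0.5):
    """Simple exponential smoothing; the last level serves as the one-step forecast."""
    level = series[0]
    smoothed = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


# Hypothetical toy data: sampling dates, viral signal and biomarker signal.
dates = [date(2022, 1, 3), date(2022, 1, 5), date(2022, 1, 11),
         date(2022, 1, 13), date(2022, 1, 20)]
virus_signal = [100.0, 300.0, 400.0, 800.0, 900.0]
biomarker = [10.0, 10.0, 20.0, 20.0, 30.0]

normalized = normalize(virus_signal, biomarker)
weeks, weekly = weekly_mean(dates, normalized)
smoothed = exp_smooth(weekly, alpha=0.5)
forecast_next_week = smoothed[-1]  # one-week-ahead forecast under this simple model
print(weeks, weekly, forecast_next_week)
```

The one-step forecast of simple exponential smoothing is just the last smoothed level, which is in line with the abstract's point that such simple models are suited to short horizons only.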

Keywords: Data modelling; Forecast; Regression; SARS-CoV-2; Smoothing; Wastewater-based epidemiology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Forecasting
  • Humans
  • Pandemics
  • SARS-CoV-2*
  • Wastewater
  • Wastewater-Based Epidemiological Monitoring

Substances

  • Waste Water