Estimating the prevalence of COVID-19 cases through the analysis of SARS-CoV-2 RNA copies derived from wastewater samples from North Dakota

Glob Epidemiol. 2023 Oct 12:6:100124. doi: 10.1016/j.gloepi.2023.100124. eCollection 2023 Dec.

Abstract

SARS-CoV-2 was first detected in December 2019, prompting many researchers to investigate how the virus spreads. It is transmitted mainly through respiratory droplets. Symptoms appear only after an incubation period, and asymptomatic infected individuals can spread the virus unknowingly. Detecting infected people requires daily testing and contact tracing, both of which are expensive. Wastewater-based epidemiology offers a timely and cost-effective way to detect infectious diseases, including COVID-19, at an early stage. In this study, we collected samples from wastewater treatment plants in several cities in North Dakota and extracted viral RNA copies from them. We used the log of the RNA copies as the predictor in Quantile Regression (QR) and K-Nearest Neighbor (KNN) Regression models to predict the number of infected cases. Model performance was evaluated by comparing the Mean Absolute Percentage Error (MAPE). The QR model performs well in cities with a population greater than 10,000. In addition, the model predictions were compared with those of the basic Susceptible-Infected-Recovered (SIR) model, which is the gold standard model for infectious diseases.

Keywords: COVID-19; K-nearest neighbor regression; Mean absolute percentage error; Quantile regression; SARS-CoV-2; SIR model.
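
As an illustration of the evaluation described in the abstract, the following is a minimal sketch, not the authors' implementation. It shows MAPE, the check (pinball) loss that quantile regression minimizes, and a one-predictor KNN regression on log-transformed RNA copies. The function names, the base-10 logarithm, the choice of k, and all numeric values are assumptions made up for illustration; none of them come from the study.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent (assumes no actual value is zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def pinball_loss(actual, predicted, tau=0.5):
    """Average check (pinball) loss; quantile regression minimizes this for quantile tau."""
    total = 0.0
    for a, p in zip(actual, predicted):
        residual = a - p
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(actual)


def knn_predict(train_x, train_y, x, k=3):
    """Predict y at x as the mean target of the k training points nearest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical values for illustration only, NOT data from the study.
rna_copies = [1.0e3, 1.0e4, 1.0e5, 1.0e6, 1.0e7]
cases = [10.0, 20.0, 40.0, 80.0, 160.0]

# Assumed base-10 log transform of the RNA copies (the abstract only says "log").
log_rna = [math.log10(c) for c in rna_copies]

# Predict cases for a new sample from its two nearest neighbors in log-RNA space.
prediction = knn_predict(log_rna, cases, math.log10(2.0e5), k=2)
```

In this toy setup, the two nearest neighbors of the new sample are the training points at 1.0e5 and 1.0e6 copies, so the prediction is the mean of their case counts. Comparing `mape` across models and cities mirrors the comparison the abstract describes, under the assumptions stated above.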