Early detection of norovirus outbreak using machine learning methods in South Korea

PLoS One. 2022 Nov 16;17(11):e0277671. doi: 10.1371/journal.pone.0277671. eCollection 2022.

Abstract

Background: Norovirus is a major cause of acute gastroenteritis at all ages and is particularly likely to affect children under the age of five. Because norovirus outbreaks in Korea are seasonal, it is important to predict when they start and end.

Methods: We predicted weekly norovirus warnings with six machine learning algorithms, using data from 2009 to 2016 for training and data from 2017 to 2018 for testing. In addition, we proposed a novel method for the early detection of norovirus based on a calculated norovirus risk index. Further, feature importance was calculated to evaluate the contribution of each feature to the estimated weekly norovirus warnings.
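
As an illustration of the chronological split described in the Methods (2009 to 2016 for training, 2017 to 2018 for testing), the following is a minimal sketch, not the authors' code. The record layout, the function name `split_by_year`, the feature names, and the zero-valued placeholder data are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical weekly record: (year, week, features, warning_label).
Record = Tuple[int, int, Dict[str, float], int]


def split_by_year(
    records: List[Record],
    train_years: range = range(2009, 2017),  # 2009..2016 inclusive
    test_years: range = range(2017, 2019),   # 2017..2018 inclusive
) -> Tuple[List[Record], List[Record]]:
    """Split weekly records by calendar year so that every test week
    comes after every training week (no random shuffling)."""
    train = [r for r in records if r[0] in train_years]
    test = [r for r in records if r[0] in test_years]
    return train, test


# Synthetic placeholder data (assumption): 52 weeks per year, 2009..2018.
# Feature names echo the abstract; the values are dummies, not real data.
records: List[Record] = [
    (year, week,
     {"last_detection_rate": 0.0, "min_temperature": 0.0, "day_length": 0.0},
     0)
    for year in range(2009, 2019)
    for week in range(1, 53)
]

train, test = split_by_year(records)
print(len(train), len(test))
```

Splitting by year rather than at random keeps later weeks out of the training set, which is the usual reason for evaluating a forecasting model on a held-out later period.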

Results: The long short-term memory (LSTM) algorithm proved to be the best algorithm for predicting weekly norovirus warnings, with 97.2% accuracy on the training data and 92.5% accuracy on the test data. The start and end weeks of early detection predicted by the LSTM algorithm fell within a 3-week range of the observed weeks.
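
To make the two reported quantities concrete, the sketch below computes weekly accuracy and checks whether predicted start and end weeks fall within 3 weeks of the observed ones. It is an illustration under stated assumptions, not the authors' procedure: warnings are assumed to be 0/1 labels per week, the start and end of a season are assumed to be the first and last warning weeks, and the function names and toy series are hypothetical. It does not implement the norovirus risk index, which the abstract does not define.

```python
from typing import List, Optional, Tuple

Span = Optional[Tuple[int, int]]


def warning_span(warnings: List[int]) -> Span:
    """Return (first, last) 0-based week index with warning == 1, or None."""
    weeks = [i for i, w in enumerate(warnings) if w == 1]
    if not weeks:
        return None
    return weeks[0], weeks[-1]


def within_tolerance(predicted: Span, observed: Span,
                     tolerance_weeks: int = 3) -> bool:
    """True if both the start and the end week differ by at most the tolerance."""
    if predicted is None or observed is None:
        return False
    return (abs(predicted[0] - observed[0]) <= tolerance_weeks
            and abs(predicted[1] - observed[1]) <= tolerance_weeks)


def accuracy(predicted: List[int], observed: List[int]) -> float:
    """Fraction of weeks where the predicted warning equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy series (assumption), one value per week.
observed = [0, 0, 1, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 1, 1, 1, 0]

print(warning_span(observed))    # (2, 5)
print(warning_span(predicted))   # (1, 6)
print(within_tolerance(warning_span(predicted), warning_span(observed)))  # True
print(accuracy(predicted, observed))  # 0.75
```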

Conclusions: The results of this study show that early detection can provide important insights for the government's preparation for and control of norovirus outbreaks. Our method provides indicators of high-risk weeks. In particular, the last norovirus detection rate, minimum temperature, and day length play critical roles in estimating weekly norovirus warnings.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Caliciviridae Infections* / diagnosis
  • Caliciviridae Infections* / epidemiology
  • Child
  • Disease Outbreaks
  • Gastroenteritis* / diagnosis
  • Gastroenteritis* / epidemiology
  • Humans
  • Machine Learning
  • Norovirus*

Grants and funding

This work was supported by the BK21 FOUR Program funded by the Pusan National University Research Grant, 2020, the National Research Foundation of Korea (NRF) funded by the Korean Government (MSIT) (NRF-2020R1C1C1A01012557), and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1A2B5B03087097). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.