Epitweetr: Early warning of public health threats using Twitter data

Euro Surveill. 2022 Sep;27(39):2200177. doi: 10.2807/1560-7917.ES.2022.27.39.2200177.

Abstract

Background: The European Centre for Disease Prevention and Control (ECDC) systematically collates information from sources to rapidly detect early public health threats. The lack of a freely available, customisable and automated early warning tool using data from Twitter prompted the ECDC to develop epitweetr, which collects, geolocates and aggregates tweets, generating signals and email alerts.

Aim: This study aims to compare the performance of epitweetr with that of manually monitoring tweets for the early detection of public health threats.

Methods: We calculated the general and specific positive predictive value (PPV) of signals generated by epitweetr between 19 October and 30 November 2020. We compared the sensitivity, specificity, timeliness and accuracy, and the performance of the tweet geolocation and signal detection algorithms, obtained from epitweetr and from the manual monitoring of 1,200 tweets.

Results: The epitweetr geolocation algorithm had an accuracy of 30.1% at the national level and 25.9% at the subnational level. The signal detection algorithm had a general PPV of 3.0% and a specific PPV of 74.6%. Compared with manual monitoring, epitweetr had greater sensitivity (78.6% vs 47.9%) and lower PPV (74.6% vs 97.9%). The median validation time difference between 16 common events detected by both epitweetr and manual monitoring was -48.6 hours (IQR: -102.8 to -23.7).

Conclusion: Epitweetr has shown sufficient performance as an early warning tool for public health threats using Twitter data. Since epitweetr is a free, open-source tool with configurable settings and a strong automated component, it is expected to increase in usability and usefulness to public health experts.
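The evaluation metrics named in the abstract (sensitivity, PPV, and a median with interquartile range for time differences) follow standard definitions. A minimal Python sketch of those definitions is given here for orientation only. The function names and all input values are hypothetical and purely illustrative; they are not the study's data, and this is not epitweetr's own code (epitweetr is distributed as an R package).

```python
from statistics import quantiles


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of real events that were detected: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Share of raised signals that were real events: TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


def median_and_iqr(values: list[float]) -> tuple[float, tuple[float, float]]:
    """Return the median and the (Q1, Q3) bounds of the interquartile range.

    Uses the 'inclusive' quartile method; other quartile conventions
    can give slightly different bounds on small samples.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Illustrative counts only (NOT taken from the study).
example_sensitivity = sensitivity(true_positives=8, false_negatives=2)
example_ppv = positive_predictive_value(true_positives=3, false_positives=1)

# Illustrative time differences in hours between two detection methods
# for the same events (hypothetical values, sign convention assumed).
example_median, example_iqr = median_and_iqr([-50.0, -40.0, -30.0, -20.0, -10.0])
```

Sensitivity and PPV answer different questions: the first asks how many real events a method catches, the second how many of its signals turn out to be real, which is why one method can score higher on one and lower on the other.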

Keywords: Twitter; early warning; epidemic intelligence; machine learning; public health.

MeSH terms

  • Algorithms
  • Data Collection
  • Humans
  • Public Health*
  • Social Media*