Temporal and Location Variations, and Link Categories for the Dissemination of COVID-19-Related Information on Twitter During the SARS-CoV-2 Outbreak in Europe: Infoveillance Study

J Med Internet Res. 2020 Aug 28;22(8):e19629. doi: 10.2196/19629.

Abstract

Background: The spread of the 2019 novel coronavirus disease, COVID-19, across Asia and Europe sparked a significant increase in public interest and media coverage, including on social media platforms such as Twitter. In this context, the origin of information plays a central role in the dissemination of evidence-based information about the SARS-CoV-2 virus and COVID-19. On February 2, 2020, the World Health Organization (WHO) declared a "massive infodemic" and argued that this situation "makes it hard for people to find trustworthy sources and reliable guidance when they need it."

Objective: This infoveillance study, conducted during the early phase of the COVID-19 pandemic, focuses on the social media platform Twitter, which allows monitoring of the dynamic pandemic situation on a global scale across different topics, languages, regions, and even whole countries. Of particular interest are temporal and geographical variations of COVID-19-related tweets, the situation in Europe, and the categories and origin of shared external resources.

Methods: Twitter's Streaming application programming interface was used to filter tweets based on 16 prevalent hashtags related to the COVID-19 outbreak. Each tweet's text and corresponding metadata, as well as the user's profile information, were extracted and stored in a database. Metadata included links to external resources. A link categorization scheme, introduced in a study by Chew and Eysenbach in 2009, was applied to the top 250 shared resources to analyze the relative proportion of each category. Moreover, temporal variations of global tweet volumes were analyzed, and a specific analysis was conducted for the European region.
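
The link analysis described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tweet structure (a dict with a "urls" list of full URLs), the function names, and the domain-to-category mapping are all assumptions made for illustration. Only "Government or Public Health" is a category label quoted in this abstract; the "Social Media" and "Uncategorized" labels are placeholders.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical domain-to-category mapping (illustrative, not the study's data).
CATEGORY_BY_DOMAIN = {
    "twitter.com": "Social Media",  # placeholder label
    "cdc.gov": "Government or Public Health",
    "who.int": "Government or Public Health",
}


def domain_of(url: str) -> str:
    """Return the lowercase host of a full URL, without port or leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def top_domains(tweets, n=250):
    """Rank the domains of shared links by how often they appear."""
    counts = Counter(
        domain_of(url) for tweet in tweets for url in tweet.get("urls", [])
    )
    return counts.most_common(n)


def category_shares(ranked, categories):
    """Relative proportion of each category among the ranked domains."""
    tally = Counter(
        categories.get(domain, "Uncategorized") for domain, _ in ranked
    )
    total = sum(tally.values())
    return {cat: k / total for cat, k in tally.items()} if total else {}


# Toy input; the assumed shape is one dict per tweet with a "urls" list.
sample_tweets = [
    {"urls": ["https://www.who.int/a", "https://twitter.com/x/status/1"]},
    {"urls": ["https://twitter.com/y/status/2"]},
    {"urls": []},
]
ranked = top_domains(sample_tweets, n=250)
shares = category_shares(ranked, CATEGORY_BY_DOMAIN)
```

Here the shares are computed over distinct ranked resources rather than over individual link occurrences, which is one plausible reading of "relative proportion of each category"; weighting by share count would be an equally simple variant.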

Results: Between February 9 and April 11, 2020, a total of 21,755,802 distinct tweets were collected, posted by 4,809,842 distinct Twitter accounts. The volume of #covid19-related tweets increased after the WHO announced the name of the new disease on February 11, 2020, and stabilized at a high level at the end of March. In the regional analysis, a higher tweet volume was observed in the vicinity of major European capitals and in densely populated areas. The most frequently shared resources originated from various social media platforms (ranks 1-7). The most prevalent category in the top 50 was "Mainstream or Local News." For the category "Government or Public Health," only two information sources were found in the top 50: the US Centers for Disease Control and Prevention at rank 25 and the WHO at rank 27. The first occurrence of a prevalent scientific source was Nature (rank 116).

Conclusions: The naming of the disease by the WHO was a major signal for addressing the public with a public health response via social media platforms such as Twitter. Future studies should focus on the origin and trustworthiness of shared resources, as monitoring the spread of fake news during a pandemic is of particular importance. In addition, it would be beneficial to analyze and uncover bot networks spreading COVID-19-related misinformation.

Keywords: COVID-19; SARS-CoV-2; Twitter; disease surveillance; health informatics; infodemic; infodemiology; infoveillance; public health; social media.

MeSH terms

  • Betacoronavirus / pathogenicity*
  • COVID-19
  • Coronavirus Infections / epidemiology*
  • Disease Outbreaks
  • Europe
  • Humans
  • Pandemics
  • Pneumonia, Viral / epidemiology*
  • SARS-CoV-2
  • Social Media / standards*