Track Iran's national COVID-19 response committee's major concerns using two-stage unsupervised topic modeling

Int J Med Inform. 2021 Jan:145:104309. doi: 10.1016/j.ijmedinf.2020.104309. Epub 2020 Nov 4.

Abstract

Background: Since the World Health Organization (WHO) declared COVID-19 a Public Health Emergency of International Concern (PHEIC) on January 31, 2020, governments have faced pressure to respond to the crisis in a timely manner. The efficacy of these responses depends directly on the social behavior of the target society. People react to government actions according to the information they receive from different channels, such as news and social networks. Analyzing the news therefore offers a concise view of the information users received during the outbreak.

Methods: The raw data for this study, more than 2,400,000 posts, were collected from the official Telegram channels of news wires and agencies. Posts quoting members of Iran's national COVID-19 response committee (NCRC) were collected, cleaned, and divided into sentences. Topic modeling and tracking were applied in a two-stage framework, customized for this problem, to separate miscellaneous sentences from those presenting concerns. The first stage takes sentence embedding vectors as input and groups them with the Mapper algorithm. Sentences belonging to singleton nodes are labeled as miscellaneous. In the second stage, the remaining sentences are vectorized using the TF-IDF weighting scheme and topically modeled with latent Dirichlet allocation (LDA). Finally, relevant topics are aligned to the list of policies and actions set up by the NCRC, referred to as topic themes.
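The two-stage idea in the Methods can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. As loudly stated assumptions: the Mapper grouping is replaced by a much simpler stand-in (a sentence counts as a "singleton" when no other sentence reaches a cosine-similarity threshold), bag-of-words TF-IDF vectors stand in for sentence embeddings in stage one, and the LDA step is omitted. The function names, the threshold value, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF vector (dict: term -> weight) per sentence."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF, log((1 + n) / (1 + df)) + 1, so no weight is zero.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def split_singletons(vectors, threshold=0.2):
    """Stand-in for stage one (NOT the Mapper algorithm).

    A sentence is kept if at least one other sentence is similar enough;
    otherwise it is treated as a singleton, i.e. miscellaneous.
    """
    has_neighbor = [False] * len(vectors)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                has_neighbor[i] = has_neighbor[j] = True
    kept = [i for i, flag in enumerate(has_neighbor) if flag]
    singletons = [i for i, flag in enumerate(has_neighbor) if not flag]
    return kept, singletons


# Toy sentences invented for illustration; they are not from the study's data.
sentences = [
    "schools will remain closed next week",
    "universities and schools remain closed",
    "wash hands and wear a mask",
    "wear a mask and wash hands often",
    "the weather was pleasant today",
]

# Stage one (stand-in): flag sentences that group with nothing else.
kept, singletons = split_singletons(tfidf_vectors(sentences))

# Stage two: re-vectorize only the remaining sentences with TF-IDF.
# In the study these vectors would then be passed to an LDA topic model.
stage_two_vectors = tfidf_vectors([sentences[i] for i in kept])

print("miscellaneous:", [sentences[i] for i in singletons])
print("kept for topic modeling:", [sentences[i] for i in kept])
```

In a fuller pipeline, a topological data analysis library would supply the Mapper grouping and a topic-modeling library would supply LDA. The sketch only shows how singleton filtering shrinks the corpus before TF-IDF vectorization.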

Results: Our results show that the major concerns, presented in about half of the sentences, are (1) PCR laboratory testing, diagnosis, and screening, (2) closure of the education system, and (3) awareness actions about hand washing and face mask usage. Among the eight themes, intra-provincial travel and traffic restrictions, as well as briefings on the national and provincial status, are under-represented. The timeline of concerns, annotated with the preventive actions, illustrates how the concerns addressed by the NCRC changed over time. This timeline shows that although the announcements and public responses did not lag behind events, they cannot be considered timely. Furthermore, the fluctuating series of concerns reveals that the NCRC lacks a long-term response roadmap and that its members react to the most recently announced policy or action.

Conclusion: The results of our study can be used as a quantitative indicator for evaluating the timeliness of the public response of Iran's NCRC during the first three months of the outbreak. Moreover, they can be used in comparative studies to investigate differences between awareness actions in various countries. Results from our custom-designed framework showed that about one-third of the discussions by NCRC members cover miscellaneous topics that must be removed from the data.

Keywords: COVID-19; Government response; News mining; Public health crisis; Sentence embedding; Topic modeling; Topological data analysis.

MeSH terms

  • COVID-19*
  • Consumer Health Informatics
  • Data Mining
  • Disease Outbreaks
  • Humans
  • Iran
  • Public Health
  • SARS-CoV-2
  • Social Networking*