A stance data set on polarized conversations on Twitter about the efficacy of hydroxychloroquine as a treatment for COVID-19

Data Brief. 2020 Dec:33:106401. doi: 10.1016/j.dib.2020.106401. Epub 2020 Oct 15.

Abstract

At the time of this study, SARS-CoV-2, the virus that caused the COVID-19 pandemic, had spread widely across the world. Amid uncertainty about policies, health risks, and financial difficulties, online media, and the Twitter platform in particular, were experiencing a high volume of activity related to the pandemic. Among the hot topics, polarized debates about unconfirmed medicines for the treatment and prevention of the disease attracted significant attention from online media users. In this work, we present COVID-CQ, a stance data set of user-generated content on Twitter in the context of COVID-19. We examined more than 14,000 tweets and manually annotated the tweet initiators' opinions regarding the use of "chloroquine" and "hydroxychloroquine" for the treatment or prevention of COVID-19. To the best of our knowledge, COVID-CQ is the first data set of Twitter users' stances in the context of the COVID-19 pandemic, and the largest Twitter data set on users' stances towards a claim in any domain. We have made this data set available to the research community via the Mendeley Data repository. We expect it to be useful for many research purposes, including stance detection, the study of the evolution and dynamics of opinions regarding this outbreak, and the analysis of changes in opinion in response to exogenous shocks such as policy decisions and events.

Keywords: COVID-19; Coronavirus; Hydroxychloroquine; Opinion mining; Polarity; Social media; Stance classification; Twitter.