Automatically detecting and understanding the perception of COVID-19 vaccination: a Middle East case study

Soc Netw Anal Min. 2022;12(1):128. doi: 10.1007/s13278-022-00946-0. Epub 2022 Sep 4.

Abstract

Introduction: The development of COVID-19 vaccines has been a great relief in many countries affected by the pandemic. As a result, many governments have made significant efforts to purchase and administer vaccines to their populations. However, the rollout of such vaccines is typically met with reluctance and fear. Like any other important event, COVID-19 vaccination has been widely discussed on social media, and these discussions have influenced people's opinions about vaccination.

Objective: The goal of this study is twofold. First, it conducts a sentiment analysis of COVID-19 vaccines by automatically analyzing Arabic-language tweets. The analysis is tracked over time to better capture changes in vaccine perception. This provides insights into the most popular and accepted vaccine(s) in the Arab countries, as well as the reasons behind people's reluctance to take the vaccine. Second, it develops models to detect vaccine-related tweets, which helps gather information related to people's perception of the virus and can surface vaccine-related tweets that are not tagged with the virus's main hashtags.

Methods: Arabic tweets were collected by the authors from January 1, 2021, to April 20, 2021. We applied various Natural Language Processing (NLP) techniques to filter the collected tweets. The curated dataset included in the analysis consisted of 1,098,376 unique tweets. To achieve the first goal, we designed state-of-the-art sentiment analysis techniques to extract the degree of acceptance of each existing vaccine and the main obstacles preventing the wider public from accepting them. To achieve the second goal, we treated the detection of vaccine-related tweets as a binary classification problem, in which various Machine Learning (ML) models were designed to identify such tweets regardless of whether they use vaccine hashtags.
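
As an illustration of the binary classification framing described in the Methods, the following is a minimal sketch and not the authors' pipeline. It trains a small logistic regression on bag-of-words counts using only the Python standard library. The toy English sentences, the whitespace tokenizer, the feature choice, and all function names are assumptions made for this example; the study worked on Arabic tweets, and its actual features and preprocessing are not specified in this abstract.

```python
import math
from collections import Counter

# Hypothetical toy examples (not from the study's dataset); 1 = vaccine-related.
TRAIN = [
    ("got my second vaccine dose today", 1),
    ("the pfizer jab gave me a sore arm", 1),
    ("booked an appointment for the vaccine", 1),
    ("side effects after the second dose were mild", 1),
    ("traffic was terrible this morning", 0),
    ("watching the football match tonight", 0),
    ("new phone arrived today", 0),
    ("the weather is lovely this morning", 0),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tweets need more care)."""
    return text.lower().split()


def build_vocab(texts):
    """Map each distinct token to a feature index."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def featurize(text, vocab):
    """Sparse bag-of-words as {feature index: count}; unknown tokens are skipped."""
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    return {vocab[tok]: c for tok, c in counts.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, vocab, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent on log loss."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    rows = [(featurize(text, vocab), label) for text, label in data]
    for _ in range(epochs):
        for feats, label in rows:
            z = bias + sum(weights[i] * v for i, v in feats.items())
            err = sigmoid(z) - label  # gradient of the log loss with respect to z
            bias -= lr * err
            for i, v in feats.items():
                weights[i] -= lr * err * v
    return weights, bias


def predict(text, vocab, weights, bias):
    """Return 1 if the tweet is classified as vaccine-related, else 0."""
    feats = featurize(text, vocab)
    z = bias + sum(weights[i] * v for i, v in feats.items())
    return int(sigmoid(z) >= 0.5)


if __name__ == "__main__":
    vocab = build_vocab(text for text, _ in TRAIN)
    weights, bias = train(TRAIN, vocab)
    # Neither test sentence contains a hashtag; the decision rests on the wording.
    for tweet in ("second dose of the vaccine tomorrow", "terrible traffic this morning"):
        print(tweet, "->", predict(tweet, vocab, weights, bias))
```

Because the classifier learns from the wording of a tweet rather than from its hashtags, it can flag relevant tweets that carry no vaccine hashtag at all, which is the motivation given for the second goal.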

Results: Overall, we found that the highest positive sentiments were registered for Pfizer-BioNTech, followed by Sinopharm-BIBP and Oxford-AstraZeneca. In addition, we found that 38% of all tweets showed negative sentiment, and only 12% showed positive sentiment. Notably, most sentiments range between neutral and negative, showing a lack of conviction about the importance of vaccination among the large majority of tweeters. This paper extracts the top concerns raised in the tweets and advocates taking them into account when promoting vaccination. Regarding the identification of vaccine-related tweets, the Logistic Regression model achieved the highest accuracy, 0.82. We conclude with implications for public health authorities and the scholarly community for improving vaccine acceptance.

Keywords: COVID-19 Vaccine; COVID-19 pandemic; Deep learning; Machine learning; NLP; Sentiment Analysis; Vaccine uptake.