Vaccine Adverse Event Mining of Twitter Conversations: 2-Phase Classification Study

Sedigheh Khademi Habibabadi; Pari Delir Haghighi; Frada Burstein; Jim Buttery

doi:10.2196/34305

JMIR Med Inform. 2022 Jun 16;10(6):e34305. doi: 10.2196/34305.

Authors

Sedigheh Khademi Habibabadi^#¹ ²; Pari Delir Haghighi^#³; Frada Burstein³; Jim Buttery^#¹ ⁴

Affiliations

¹ Centre for Health Analytics, Melbourne Children's Campus, Melbourne, Australia.
² Department of General Practice, University of Melbourne, Melbourne, Australia.
³ Department of Human-Centred Computing, Faculty of Information Technology, Monash University, Melbourne, Australia.
⁴ Department of Paediatrics, University of Melbourne, Melbourne, Australia.

^# Contributed equally.

PMID: 35708760
PMCID: PMC9247809
DOI: 10.2196/34305

Abstract

Background: Traditional monitoring for adverse events following immunization (AEFI) relies on established reporting systems, in which there is an inevitable lag between an AEFI occurring, its being reported, and the report being processed. AEFI safety signal detection aims to detect AEFI as early as possible, ideally close to real time. Monitoring social media data holds promise as a resource for this.

Objective: The primary aim of this study is to investigate the utility of monitoring social media for gaining early insights into vaccine safety issues by extracting vaccine adverse event mentions (VAEMs) from Twitter using natural language processing techniques. The secondary aims are to document the natural language processing techniques used and to identify which of them are most effective at identifying tweets that contain VAEMs, with a view to defining an approach that might be applicable to other similar social media surveillance tasks.

Methods: A VAEM-Mine method was developed that combines topic modeling with classification techniques to extract as many VAEM posts as possible from a vaccine-related Twitter stream, with a high degree of confidence. The approach does not require a targeted search for specific vaccine reaction-indicative words but instead identifies VAEM posts according to their language structure.
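
The two-phase idea described in the Methods (topic modeling, then classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the specific models, so LDA and a TF-IDF logistic regression from scikit-learn are assumed here as stand-ins, the toy posts and labels are invented, and the function names (`assign_topics`, `train_classifier`, `run_demo`) are hypothetical.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy posts, for illustration only (not data from the study).
POSTS = [
    "got my flu shot yesterday and now my arm is sore and swollen",
    "had a fever and chills the night after my vaccine",
    "my arm is so sore after the shot and I have a headache",
    "new vaccine clinic opens downtown next week",
    "health department announces vaccine schedule for schools",
    "clinic announces new opening hours for vaccine appointments",
]

# Invented labeled examples for phase 2 (1 = VAEM, 0 = not a VAEM).
TRAIN_TEXTS = [
    "sore arm and fever after my shot",
    "felt dizzy with chills after the vaccine",
    "my kid got a rash after her vaccination",
    "vaccine clinic opens on monday",
    "new vaccine schedule announced",
    "book your vaccine appointment online",
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]


def assign_topics(posts, n_topics=2, random_state=0):
    """Phase 1 (assumed: LDA): return each post's dominant topic index."""
    counts = CountVectorizer(stop_words="english").fit_transform(posts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=random_state)
    doc_topics = lda.fit_transform(counts)  # shape: (n_posts, n_topics)
    return doc_topics.argmax(axis=1).tolist()


def train_classifier(texts, labels):
    """Phase 2 (assumed: TF-IDF + logistic regression): fit a text classifier."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    return model.fit(texts, labels)


def run_demo():
    topics = assign_topics(POSTS)
    # Stand-in for manual topic review: keep the topic of a known VAEM post.
    kept_topic = topics[0]
    candidates = [p for p, t in zip(POSTS, topics) if t == kept_topic]
    model = train_classifier(TRAIN_TEXTS, TRAIN_LABELS)
    predictions = model.predict(candidates).tolist()
    return candidates, predictions


if __name__ == "__main__":
    for post, label in zip(*run_demo()):
        print(label, post)
```

The design point is that phase 1 narrows the stream without any keyword list, and phase 2 then makes the per-post decision on the reduced set. With so few toy posts the topic split is not meaningful; the sketch only shows how the phases connect.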

Results: The VAEM-Mine method isolated 8992 VAEMs from 811,010 vaccine-related Twitter posts and achieved an F₁ score of 0.91 in the classification phase.
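
For orientation, the reported counts and the F₁ metric can be put in concrete terms. The sketch below computes the share of isolated VAEMs among all collected posts from the two figures in the Results, and defines F₁ as the harmonic mean of precision and recall. The abstract does not report precision or recall, so the function is shown without study values; the name `f1_score` is a local helper, not a library call.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Counts taken from the Results paragraph.
vaem_posts = 8992
total_posts = 811_010

share = vaem_posts / total_posts
print(f"VAEM share of vaccine-related posts: {share:.2%}")  # 1.11%
```

Roughly 1 in 90 vaccine-related posts was retained as a VAEM, which is why a high-recall filtering step before classification matters for a stream of this size.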

Conclusions: Social media can assist with the detection of vaccine safety signals as a valuable complementary source for monitoring mentions of vaccine adverse events. A social media-based VAEM data stream can be assessed for changes to detect possible emerging vaccine safety signals, helping to address the well-recognized limitations of passive reporting systems, including lack of timeliness and underreporting.

Keywords: Twitter; immunization; machine learning; natural language processing; social media; vaccine adverse effects; vaccine safety; vaccines.

©Sedigheh Khademi Habibabadi, Pari Delir Haghighi, Frada Burstein, Jim Buttery. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 16.06.2022.