Automatic Identification of Self-Reported COVID-19 Vaccine Information from Vaccine Adverse Events Reporting System

Methods Inf Med. 2023 May;62(1-02):49-59. doi: 10.1055/s-0042-1760248. Epub 2023 Jan 9.

Abstract

Background: The short time frame between the coronavirus disease 2019 (COVID-19) pandemic declaration and the vaccines authorization led to concerns among public regarding the safety and efficacy of the vaccines. The Food and Drug Administration uses the Vaccine Adverse Events Reporting System (VAERS) where general population can report their vaccine side effects in the text box. This information could be utilized to determine self-reported vaccine side effects.

Objective: To develop a supervised and unsupervised natural language processing (NLP) pipeline to extract self-reported COVID-19 vaccination side effects, location of the side effects, medications, and possibly false/misinformation seeking further investigation in a structured format for analysis and reporting.

Methods: We utilized the VAERS dataset of COVID-19 vaccine reports from November 2020 to August 2022 of 725,246 individuals. We first developed a gold-standard (GS) dataset of randomly selected 1,500 records. Second, the GS was split into training, testing, and validation sets. The training dataset was used to develop the NLP applications (supervised and unsupervised) and testing and validation datasets were used to test the performances of the NLP application.

Results: The NLP application automatically extracted vaccine side effects, body locations of the side effects, medication, and possibly misinformation with moderate to high accuracy (84% sensitivity, 82% specificity, and 83% F-1 measure). We found that 23% people (386,270) faced arm soreness, 31% body swelling (226,208), 23% fatigue/body weakness (168,160), and 22% (159,873) cold/flue-like symptoms. Most of the complications occurred in the body locations such as the arm, back, chest, neck, face, and head. Over-the-counter pain medications such as Tylenol and Ibuprofen and allergy medication like Benadryl were most reported self-reported medications. Death due to COVID-19, changes in the DNA, and infertility were possible false/misinformation reported by people.

Conclusion: Some self-reported side effects such as syncope, arthralgia, and blood clotting need further clinical investigations. Our NLP application may help in extracting information from big free-text electronic datasets to help policy makers and other researchers with decision making.

MeSH terms

  • Adverse Drug Reaction Reporting Systems
  • COVID-19 Vaccines / adverse effects
  • COVID-19* / epidemiology
  • COVID-19* / prevention & control
  • Drug-Related Side Effects and Adverse Reactions* / epidemiology
  • Humans
  • Self Report
  • Vaccines* / adverse effects

Substances

  • COVID-19 Vaccines
  • Vaccines