Mining association rules from COVID-19 related Twitter data to discover word patterns, topics and inferences

Inf Syst. 2022 Nov:109:102054. doi: 10.1016/j.is.2022.102054. Epub 2022 Apr 25.

Abstract

This work uses data from Twitter to mine association rules and extract knowledge about public attitudes toward worldwide crises. It takes the COVID-19 pandemic as a use case and analyzes tweets gathered between February and August 2020. The proposed methodology comprises topic extraction and visualization techniques, such as WordClouds, to form clusters or themes of opinions. It then uses Association Rule Mining (ARM) to discover frequent wordsets and generate rules from which user attitudes can be inferred. The goal is to use ARM as a postprocessing technique that enhances the output of any topic extraction method. Accordingly, only strong wordsets are stored, after trivial ones are discarded. We also employ frequent wordset identification to reduce the number of extracted topics. Our findings show that 50 initially retrieved topics are narrowed down to just 4 when Latent Dirichlet Allocation is combined with ARM. Our methodology facilitates producing more accurate and generalizable results, while exposing implications regarding social media user attitudes.
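To make the ARM step concrete, the following is a minimal sketch of frequent wordset mining followed by rule generation. It is not the authors' implementation. The toy tweets, the function names `frequent_wordsets` and `association_rules`, and the support and confidence thresholds are all assumptions chosen for illustration, and the brute-force enumeration stands in for a proper Apriori or FP-Growth miner.

```python
from collections import Counter
from itertools import combinations


def frequent_wordsets(docs, min_support, max_size=3):
    """Return {frozenset(words): support} for wordsets meeting min_support.

    Brute-force enumeration (an assumption made for brevity); a real
    pipeline would use Apriori or FP-Growth to prune the search space.
    """
    n = len(docs)
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc))  # each word counts once per tweet
        for size in range(1, max_size + 1):
            for combo in combinations(words, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def association_rules(supports, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    confidence(A -> B) = support(A and B) / support(A). Every subset of a
    frequent wordset is itself frequent, so supports[antecedent] exists.
    """
    rules = []
    for wordset, sup in supports.items():
        if len(wordset) < 2:
            continue
        for size in range(1, len(wordset)):
            for ante in combinations(sorted(wordset), size):
                ante = frozenset(ante)
                conf = sup / supports[ante]
                if conf >= min_confidence:
                    rules.append((ante, wordset - ante, sup, conf))
    return rules


# Hypothetical, already tokenized tweets (not data from the study).
tweets = [
    ["covid", "lockdown", "home"],
    ["covid", "vaccine"],
    ["covid", "lockdown"],
    ["lockdown", "home"],
    ["covid", "vaccine", "trial"],
]

supports = frequent_wordsets(tweets, min_support=0.4)
rules = association_rules(supports, min_confidence=0.8)

for ante, cons, sup, conf in rules:
    print(sorted(ante), "->", sorted(cons), f"support={sup:.2f}", f"confidence={conf:.2f}")
```

In this sketch, the support threshold drops rare wordsets and the confidence threshold keeps only strong rules, loosely mirroring the idea of storing strong wordsets and discarding trivial ones. How the study defines "trivial" and which thresholds it uses are not stated in the abstract.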

Keywords: Association rule mining; COVID-19; Data mining; Social media; Topic extraction.