Analysis of mental and physical disorders associated with COVID-19 in online health forums: a natural language processing study

BMJ Open. 2021 Nov 5;11(11):e056601. doi: 10.1136/bmjopen-2021-056601.

Abstract

Objectives: Online health forums provide rich and untapped real-time data on population health. Through novel data extraction and natural language processing (NLP) techniques, we characterise the evolution of mental and physical health concerns relating to the COVID-19 pandemic among online health forum users.

Setting and design: We obtained data from three leading online health forums (HealthBoards, Inspire and HealthUnlocked) for the period 1 January 2020 to 31 May 2020. Using NLP, we analysed the content of posts related to COVID-19.

Primary outcome measures: (1) Proportion of forum posts containing COVID-19 keywords; (2) proportion of forum users making their very first post about COVID-19; (3) proportion of COVID-19-related posts containing content related to physical and mental health comorbidities.
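As an illustration only, the three outcome measures could be computed along the following lines. This is a minimal sketch, not the study's pipeline: the keyword lists, the `Post` record and the `outcome_measures` function are hypothetical, and simple case-insensitive substring matching stands in for whatever NLP methods the authors used. It also assumes that a user's earliest post in the dataset is their first post on the forum.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical keyword lists; the abstract does not give the study's lexicon.
COVID_KEYWORDS = ["covid", "coronavirus", "sars-cov-2"]
COMORBIDITY_KEYWORDS = ["anxiety", "depression", "asthma", "diabetes"]


@dataclass
class Post:
    """Hypothetical minimal record for one forum post."""
    user: str
    posted: date
    text: str


def mentions(text, keywords):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def outcome_measures(posts):
    """Compute the three proportions for a non-empty list of posts."""
    covid_posts = [p for p in posts if mentions(p.text, COVID_KEYWORDS)]

    # Earliest post per user; assumed here to be the user's first post.
    first_posts = {}
    for p in sorted(posts, key=lambda p: p.posted):
        first_posts.setdefault(p.user, p)

    first_about_covid = sum(
        mentions(p.text, COVID_KEYWORDS) for p in first_posts.values()
    )
    with_comorbidity = sum(
        mentions(p.text, COMORBIDITY_KEYWORDS) for p in covid_posts
    )

    return {
        # (1) share of all posts containing a COVID-19 keyword
        "covid_post_share": len(covid_posts) / len(posts),
        # (2) share of users whose first post mentions COVID-19
        "first_post_covid_share": first_about_covid / len(first_posts),
        # (3) share of COVID-19 posts that also mention a comorbidity
        "comorbidity_share": (
            with_comorbidity / len(covid_posts) if covid_posts else 0.0
        ),
    }
```

Substring matching is the simplest possible choice and will miss misspellings and paraphrases; it is used here only to make the three definitions concrete.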

Results: Data from 739 434 posts created by 53 134 unique users were analysed. A total of 35 581 posts (4.8%) contained a COVID-19 keyword. Posts discussing COVID-19 and related comorbid disorders spiked from early to mid-March, around the time lockdowns were implemented globally, prompting a large number of users to post on online health forums for the first time. Over a quarter of COVID-19-related thread titles mentioned a physical or mental health comorbidity.

Conclusions: We demonstrate that it is feasible to characterise the content of online health forum user posts regarding COVID-19 and measure changes over time. The pandemic and corresponding public response have had a significant impact on posters' queries regarding mental health. Social media data sources such as online health forums can be harnessed to strengthen population-level mental health surveillance.

Keywords: COVID-19; health informatics; information technology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Communicable Disease Control
  • Humans
  • Natural Language Processing
  • Pandemics
  • SARS-CoV-2
  • Social Media*