Using Facebook language to predict and describe excessive alcohol use

Alcohol Clin Exp Res. 2022 May;46(5):836-847. doi: 10.1111/acer.14807. Epub 2022 May 16.

Abstract

Background: Assessing risk for excessive alcohol use is important for applications ranging from recruitment into research studies to targeted public health messaging. Social media language provides an ecologically embedded source of information for assessing individuals who may be at risk for harmful drinking.

Methods: Using data collected on 3664 respondents from the general population, we examine how accurately language used on social media classifies individuals as at-risk for alcohol problems based on Alcohol Use Disorders Identification Test-Consumption (AUDIT-C) score benchmarks.
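
As a rough illustration of how a screening score can be turned into a binary at-risk label, the sketch below sums the three items of the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C), each scored 0 to 4, and compares the total to a cutoff. The function names and the `cutoff` parameter are hypothetical. The abstract does not state which benchmarks the study used, so the cutoff of 4 in the usage lines is an assumption chosen only for illustration.

```python
def audit_c_total(item_scores):
    """Return the AUDIT-C total (0-12) from three item scores, each in 0..4."""
    if len(item_scores) != 3:
        raise ValueError("AUDIT-C has exactly three items")
    if any(not (0 <= s <= 4) for s in item_scores):
        raise ValueError("each AUDIT-C item is scored from 0 to 4")
    return sum(item_scores)


def is_at_risk(total, cutoff):
    """Label a respondent at-risk when the total meets or exceeds the cutoff.

    `cutoff` is a hypothetical parameter; the study's actual benchmarks
    are not given in the abstract.
    """
    return total >= cutoff


# Usage with an assumed, illustrative cutoff of 4.
total = audit_c_total([1, 2, 1])
print(total, is_at_risk(total, cutoff=4))
```

Labels produced this way would serve as the classification target that the language-based model tries to predict.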

Results: We find that social media language, when modeled with contextual word embeddings, is moderately accurate (area under the curve = 0.75) at identifying individuals at risk for alcohol problems (i.e., hazardous drinking/alcohol use disorders). High-risk alcohol use was predicted by individuals' usage of words related to alcohol, partying, informal expressions, swearing, and anger. Low-risk alcohol use was predicted by individuals' usage of social, affiliative, and faith-based words.
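
To make the area-under-the-curve figure concrete: the area under the ROC curve equals the probability that a randomly chosen at-risk individual receives a higher model score than a randomly chosen low-risk individual, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = at-risk); scores: model scores, higher = riskier.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (made up): 3 at-risk and 4 low-risk respondents.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
print(roc_auc(labels, scores))  # 9 of 12 pairs ranked correctly -> 0.75
```

In this toy example an AUC of 0.75 means that three out of four at-risk/low-risk pairs are ordered correctly by the model score; 0.5 would correspond to chance and 1.0 to perfect ranking.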

Conclusions: The use of social media data to study drinking behavior in the general public is promising and could eventually support primary and secondary prevention efforts among Americans whose at-risk drinking may have otherwise gone "under the radar."

Keywords: excessive alcohol use; natural language processing; social media; subclinical drinking.

MeSH terms

  • Alcohol Drinking / epidemiology
  • Alcohol-Related Disorders* / epidemiology
  • Alcoholism* / diagnosis
  • Alcoholism* / epidemiology
  • Humans
  • Language
  • Social Media*