High-resolution Temporal Representations of Alcohol and Tobacco Behaviors from Social Media Data

Proc ACM Hum Comput Interact. 2017 Nov;1(CSCW):54. doi: 10.1145/3134689.

Abstract

Understanding tobacco- and alcohol-related behavioral patterns is critical for uncovering risk factors and potentially designing targeted social computing intervention systems. Because people make such choices multiple times per day, hourly and daily patterns are critical for better understanding these behaviors. Here, we combine natural language processing, machine learning, and time series analyses to assess Twitter activity specifically related to alcohol and tobacco consumption and its sub-daily, daily, and weekly cycles. Twitter self-reports of alcohol and tobacco use are compared to other data streams available at similar temporal resolution. We assess whether these temporal patterns can differentiate discussion of drinking by users inferred to be underage from that by users inferred to be of legal age, and discussion of the use of different types of tobacco products. We find that time and frequency domain representations of behaviors on social media can provide meaningful and unique insights, and we discuss the types of behaviors for which the approach may be most useful.
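As an illustration of what time and frequency domain representations of hourly activity can look like, the following is a minimal sketch and not the pipeline used in the paper. It assumes posts have already been classified as relevant and reduced to timestamps in hours since the start of the observation window. The function names (`hourly_counts`, `hour_of_week_profile`, `dominant_periods`) and the synthetic signal are hypothetical and chosen only for this example; the frequency-domain view here is a plain FFT periodogram.

```python
import numpy as np

HOURS_PER_WEEK = 168


def hourly_counts(timestamps_hours, n_hours):
    """Bin event times (hours since the start of observation) into hourly counts."""
    idx = np.floor(np.asarray(timestamps_hours, dtype=float)).astype(int)
    idx = idx[(idx >= 0) & (idx < n_hours)]  # drop events outside the window
    return np.bincount(idx, minlength=n_hours).astype(float)


def hour_of_week_profile(counts):
    """Time-domain view: mean count for each of the 168 hours of the week."""
    n_weeks = len(counts) // HOURS_PER_WEEK
    trimmed = np.asarray(counts[: n_weeks * HOURS_PER_WEEK], dtype=float)
    return trimmed.reshape(n_weeks, HOURS_PER_WEEK).mean(axis=0)


def dominant_periods(counts, top_k=2):
    """Frequency-domain view: periods (in hours) of the strongest periodogram peaks."""
    x = np.asarray(counts, dtype=float)
    x = x - x.mean()                        # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2     # simple periodogram
    freqs = np.fft.rfftfreq(len(x), d=1.0)  # cycles per hour
    order = np.argsort(power[1:])[::-1][:top_k] + 1  # skip the zero frequency
    return [float(1.0 / freqs[i]) for i in order]


# Synthetic stand-in for four weeks of hourly counts: a daily cycle plus a
# weaker weekly cycle. This is made-up data, used only to exercise the code.
t = np.arange(4 * HOURS_PER_WEEK)
signal = 10 + 5 * np.cos(2 * np.pi * t / 24) + 3 * np.cos(2 * np.pi * t / HOURS_PER_WEEK)

profile = hour_of_week_profile(signal)
periods = dominant_periods(signal, top_k=2)
```

On the synthetic series, the two strongest periods recovered are approximately 24 and 168 hours, matching the daily and weekly cycles that were built in. With real count data, detrending, windowing, and handling of missing hours would likely be needed before such peaks can be interpreted.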

Keywords: Content analysis and feature selection; Information systems → Web and social media search; Mathematics of computing → Time series analysis; Natural language processing; alcohol; behavior; health; social media; time-series; tobacco.