A lexicon-based approach to examine depression detection in social media: the case of Twitter and university community

Humanit Soc Sci Commun. 2022;9(1):325. doi: 10.1057/s41599-022-01313-2. Epub 2022 Sep 21.

Abstract

Globally, the number of people who suffer from depression is steadily increasing. Because detecting and addressing depression at an early stage is one of the strongest factors in effective treatment, a number of scholars have examined how early-stage depression can be detected and addressed. Recent studies have focused on social media, where users express their thoughts and emotions freely, as a source for depression detection. Following this trend, we examine a two-step approach to early-stage depression detection. First, we propose a depression post-classification model trained on Twitter datasets in multiple languages (Korean, English, and Japanese) to improve the applicability of the model. We also built a depression lexicon for each language, which mental health experts verified. Second, we applied the proposed model to a dataset from a more specific user group, a community of university students (Everytime), to examine whether the model can be used to detect depression posts in more specific user groups. The classification results show that the proposed model and approach can effectively detect depression posts in a general user group (Twitter) as well as in specific user group datasets. The implemented models and datasets are publicly available.
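As a rough illustration of what lexicon-based flagging of posts can look like, the following minimal sketch matches a post against a per-language term list. It is not the classification model proposed in the paper. The terms in `LEXICONS`, the function names, and the `min_hits` threshold are all hypothetical placeholders, and substring matching is an assumption made here so that Korean and Japanese text, which is not whitespace-delimited, can be handled without a tokenizer.

```python
# Hypothetical placeholder lexicons. These are NOT the expert-verified
# lexicons built in the study; they only illustrate the data shape.
LEXICONS = {
    "en": {"hopeless", "worthless", "depressed"},
    "ko": {"우울", "무기력"},
    "ja": {"憂鬱", "死にたい"},
}


def lexicon_hits(text, lang):
    """Return the sorted lexicon terms of `lang` that occur in `text`.

    Matching is a case-insensitive substring test (an assumption of this
    sketch), so no language-specific tokenizer is required.
    """
    normalized = text.casefold()
    return sorted(term for term in LEXICONS[lang] if term.casefold() in normalized)


def is_depression_post(text, lang, min_hits=1):
    """Flag a post when at least `min_hits` lexicon terms occur in it."""
    return len(lexicon_hits(text, lang)) >= min_hits


if __name__ == "__main__":
    print(lexicon_hits("I feel hopeless and worthless", "en"))
    print(is_depression_post("Nice weather today", "en"))
```

In practice, a lexicon match of this kind would more plausibly serve as a feature or a filtering step for a trained classifier than as the final decision, since simple term matching cannot distinguish, for example, a first-person statement from a quotation.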

Keywords: Cultural and media studies; Health humanities.