Automatic anxiety recognition method based on microblog text analysis

Yang Yu; Qi Li; Xiaoqian Liu

doi:10.3389/fpubh.2023.1080013

Front Public Health. 2023 Mar 20;11:1080013. doi: 10.3389/fpubh.2023.1080013. eCollection 2023.

Authors

Yang Yu¹ ², Qi Li², Xiaoqian Liu¹ ³

Affiliations

¹ Institute of Psychology, Chinese Academy of Sciences, Beijing, China.
² Learning and Cognition Key Laboratory of Beijing, School of Psychology, Capital Normal University, Beijing, China.
³ Department of Psychology, University of Chinese Academy of Sciences, Beijing, China.

Abstract

Mental health has traditionally been assessed using self-report questionnaires. Although this approach produces accurate results, it has the disadvantage of being labor-intensive and time-consuming. This study aimed to extract original text published by users on the social media platform Sina Weibo and to use a machine learning method to train a model that predicts a user's anxiety state automatically. Data from 1,039 users were collected. First, Weibo users were invited to fill out an anxiety self-assessment scale, and all original text they had ever published was collected. Second, Simplified Chinese-Linguistic Inquiry and Word Count (SC-LIWC) features were extracted for feature selection and model training. We found that the model achieved the best performance when the XGBoostRegressor algorithm was used. The Pearson correlation coefficient between model-predicted scores and self-reported scores was moderate (r = 0.322). In addition, we tested the reliability of the model and found it to be high (r = 0.72). The experimental results further showed that the model was feasible and effective and that digital footprints could be used to predict psychological characteristics.
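
The evaluation metric named in the abstract, the Pearson correlation coefficient between model-predicted and self-reported scores, can be illustrated with a minimal sketch. This is not the authors' code: the function name pearson_r and the score lists are hypothetical, and the numbers are made up for illustration rather than taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative scores (assumption), not data from the study.
self_reported = [30.0, 42.0, 55.0, 38.0, 61.0, 47.0]
predicted = [35.0, 40.0, 48.0, 44.0, 52.0, 41.0]

r = pearson_r(self_reported, predicted)
print(f"r = {r:.3f}")
```

In a setup like the one described, xs would hold the questionnaire scores and ys the model's predictions for the same users; a value near 0 indicates little linear agreement and a value near 1 indicates strong agreement.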

Keywords: SC-LIWC; Weibo data; anxiety recognition; machine learning; social media platform.

MeSH terms

Anxiety
Anxiety Disorders
Humans
Language*
Linguistics*
Reproducibility of Results