The voice of Twitter: observable subjective well-being inferred from tweets in Russian

PeerJ Comput Sci. 2022 Dec 20;8:e1181. doi: 10.7717/peerj-cs.1181. eCollection 2022.

Abstract

As one of the major platforms of communication, social networks have become a valuable source of opinions and emotions. Considering that sharing of emotions offline and online is quite similar, historical posts from social networks seem to be a valuable source of data for measuring observable subjective well-being (OSWB). In this study, we calculated OSWB indices for the Russian-speaking segment of Twitter using the Affective Social Data Model for Socio-Technical Interactions. This model utilises demographic information and post-stratification techniques to make the data sample representative, by selected characteristics, of the general population of a country. For sentiment analysis, we fine-tuned RuRoBERTa-Large on RuSentiTweet and achieved new state-of-the-art results of F1 = 0.7229. Several calculated OSWB indicators demonstrated moderate Spearman's correlation with the traditional survey-based net affect (r_s = 0.469 and r_s = 0.5332, p < 0.05) and positive affect (r_s = 0.5177 and r_s = 0.548, p < 0.05) indices in Russia.
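
As a rough illustration of the pipeline summarised above, the following minimal sketch computes a post-stratified net affect index from labelled posts and compares two series with Spearman's rank correlation. It is not the authors' implementation: the function names, the stratum labels, the three-way sentiment labels, and the definition of net affect as the share of positive posts minus the share of negative posts are all assumptions made for the example, and the data are invented.

```python
from collections import defaultdict


def poststratified_net_affect(posts, population_shares):
    """Weight per-stratum net affect by population shares (hypothetical helper).

    posts: iterable of (stratum, sentiment) pairs, with sentiment assumed to be
           "positive", "negative", or "neutral".
    population_shares: dict mapping stratum -> share of the general population.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "total": 0})
    for stratum, sentiment in posts:
        cell = counts[stratum]
        cell["total"] += 1
        if sentiment in ("positive", "negative"):
            cell[sentiment] += 1

    # Only strata that appear in the sample can contribute; renormalise
    # the population shares over those strata.
    observed = [s for s in population_shares if counts[s]["total"] > 0]
    total_share = sum(population_shares[s] for s in observed)
    if total_share == 0:
        raise ValueError("no posts fall into a known stratum")

    index = 0.0
    for s in observed:
        cell = counts[s]
        # Assumed definition: net affect = (positive - negative) / all posts.
        net = (cell["positive"] - cell["negative"]) / cell["total"]
        index += population_shares[s] / total_share * net
    return index


def _average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r_s computed as the Pearson correlation of the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented toy data: the "young" stratum is over-represented in the sample.
toy_posts = [
    ("young", "positive"), ("young", "positive"), ("young", "negative"),
    ("old", "negative"),
]
toy_shares = {"young": 0.25, "old": 0.75}
toy_index = poststratified_net_affect(toy_posts, toy_shares)
toy_rs = spearman([1, 2, 3, 4], [10, 20, 30, 25])
```

In this toy sample the unweighted net affect is 0 (two positive and two negative posts out of four), whereas the post-stratified value is about -0.67, because the under-sampled "old" stratum carries three quarters of the population weight. A real analysis would also need the significance test reported alongside r_s, which this sketch omits.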

Keywords: Computational social science; Happiness index; Language models; Machine learning; Observable subjective well-being; Sentiment analysis; Subjective well-being; User-generated content; Social networks.

Grants and funding

The authors received no funding for this work.