Retweet communities reveal the main sources of hate speech

Bojan Evkoski; Andraž Pelicon; Igor Mozetič; Nikola Ljubešić; Petra Kralj Novak

doi:10.1371/journal.pone.0265602

PLoS One. 2022 Mar 17;17(3):e0265602. doi: 10.1371/journal.pone.0265602. eCollection 2022.

Authors

Bojan Evkoski¹ ², Andraž Pelicon¹ ², Igor Mozetič¹, Nikola Ljubešić¹ ³, Petra Kralj Novak¹

Affiliations

¹ Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia.
² Jozef Stefan International Postgraduate School, Ljubljana, Slovenia.
³ Faculty of Information and Communication Sciences, University of Ljubljana, Ljubljana, Slovenia.

Abstract

We address the challenging problem of identifying the main sources of hate speech on Twitter. First, we carefully annotate a large set of tweets for hate speech and deploy advanced deep learning to produce high-quality hate speech classification models. Second, we create retweet networks, detect communities, and monitor their evolution through time. This combined approach is applied to three years of Slovenian Twitter data. Hate speech is dominated by offensive tweets related to political and ideological issues. The share of unacceptable tweets increases moderately over time, from an initial 20% to 30% by the end of 2020. Unacceptable tweets are retweeted significantly more often than acceptable tweets. About 60% of unacceptable tweets are produced by a single right-wing community of only moderate size. Institutional Twitter accounts and media accounts post significantly fewer unacceptable tweets than individual accounts. In fact, the main sources of unacceptable tweets are anonymous accounts and accounts that were suspended or closed during the years 2018-2020.
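The combined approach described in the abstract (label tweets, group accounts into retweet communities, then attribute unacceptable tweets to communities) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the function names, and in particular the use of connected components as a simple stand-in for community detection, since the abstract does not name the algorithm the authors used.

```python
from collections import defaultdict

# Hypothetical toy data: (retweeter, original_author) pairs.
retweets = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("x", "y"), ("y", "z"),
]

# Hypothetical classifier output: (author, is_unacceptable), one entry per tweet.
tweets = [
    ("a", True), ("a", False), ("b", True), ("c", True),
    ("x", False), ("y", False), ("z", True), ("z", False),
]


def build_graph(edges):
    """Build an undirected retweet graph as an adjacency mapping."""
    graph = defaultdict(set)
    for retweeter, author in edges:
        graph[retweeter].add(author)
        graph[author].add(retweeter)
    return graph


def connected_components(graph):
    """Stand-in for community detection: group accounts linked by retweets."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            members.add(node)
            stack.extend(graph[node] - seen)
        components.append(members)
    return components


def unacceptable_share(communities, labelled_tweets):
    """Fraction of all unacceptable tweets posted by each community."""
    total = sum(1 for _, bad in labelled_tweets if bad)
    shares = []
    for members in communities:
        bad = sum(1 for author, flag in labelled_tweets if flag and author in members)
        shares.append(bad / total if total else 0.0)
    return shares


communities = connected_components(build_graph(retweets))
shares = unacceptable_share(communities, tweets)
for members, share in zip(communities, shares):
    print(sorted(members), f"{share:.0%}")
```

On real data, a modularity-based community detection method would likely replace the connected-components step, because a large retweet network tends to form one giant component that this simple stand-in cannot split.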

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Communications Media*
Hate
Humans
Language
Social Media*
Speech

Grants and funding

The authors acknowledge financial support from the Slovenian Research Agency (research core funding no. P2-103 and P6-0411), the Slovenian Research Agency and the Flemish Research Foundation bilateral research project LiLaH (grant no. ARRS-N6-0099 and FWO-G070619N), and the European Union’s Rights, Equality and Citizenship Programme (2014-2020) project IMSyPP (grant no. 875263). The European Commission’s support for the production of this publication does not constitute an endorsement of the contents, which reflect the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.