ProTect: a hybrid deep learning model for proactive detection of cyberbullying on social media

T Nitya Harshitha; M Prabu; E Suganya; S Sountharrajan; Durga Prasad Bavirisetti; Navya Gadde; Lakshmi Sahithi Uppu

doi:10.3389/frai.2024.1269366

Front Artif Intell. 2024 Mar 6;7:1269366. doi: 10.3389/frai.2024.1269366. eCollection 2024.

Authors

T Nitya Harshitha¹, M Prabu¹, E Suganya², S Sountharrajan¹, Durga Prasad Bavirisetti³, Navya Gadde¹, Lakshmi Sahithi Uppu¹

Affiliations

¹ Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Chennai, India.
² Department of Information Technology, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India.
³ Department of Computer Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway.

Abstract

The emergence of social media has created many networking and communication opportunities, but it has also given rise to cyberbullying, a well-known problem that continues to grow. Researchers have long addressed cyberbullying by applying machine learning (ML) and deep learning (DL) techniques. However, although these algorithms perform well on artificial datasets, they do not deliver similar results on real-time datasets with high levels of noise and imbalance. Consequently, finding generic algorithms that work on dynamic data from several platforms is critical. This study used a novel hybrid random forest (RF)-based convolutional neural network (CNN) model for text classification, combining the strengths of both approaches. Real-time datasets from Twitter and Instagram were collected and annotated to demonstrate the effectiveness of the proposed technique. The performance of various ML and DL algorithms was compared, and the RF-based CNN model outperformed them in both accuracy and execution speed. Speed is particularly important for detecting bullying episodes promptly and providing timely assistance to victims. The model achieved an accuracy of 96% and delivered results 3.4 seconds faster than standard CNN models.

Keywords: cyber bullying; data mining; deep learning; machine learning; neural network; social media; text analysis.

Grants and funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.