A systematic literature review on spam content detection and classification

PeerJ Comput Sci. 2022 Jan 20:8:e830. doi: 10.7717/peerj-cs.830. eCollection 2022.

Abstract

Spam content on social media is increasing rapidly, which makes its detection vital. The volume of spam grows as people make extensive use of social media and related channels, e.g., Facebook, Twitter, YouTube, and e-mail. The time people spend on social media is growing quickly, especially during the pandemic. Users receive many text messages through social media and cannot always recognize the spam content in them. Spam messages contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. Detecting and controlling spam text is therefore essential to improving social media security. This paper presents a detailed survey of the latest developments in spam text detection and classification in social media. It discusses the techniques used for spam detection and classification, including Machine Learning, Deep Learning, and text-based approaches. We also present the challenges encountered in identifying spam, the mechanisms for controlling it, and the datasets used in existing work on spam detection.

Keywords: Classification; Data mining; Deep learning; Machine learning; Natural language processing; Social media analysis; Spam content; Text mining.

Grants and funding

This work was funded by Zayed University–Start-up research grant (Grant number R20081). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.