Machine learning for email spam filtering: review, approaches and open research problems

Emmanuel Gbenga Dada; Joseph Stephen Bassi; Haruna Chiroma; Shafi'i Muhammad Abdulhamid; Adebayo Olusola Adetunmbi; Opeyemi Emmanuel Ajibuwa

doi:10.1016/j.heliyon.2019.e01802

Heliyon. 2019 Jun 10;5(6):e01802. doi: 10.1016/j.heliyon.2019.e01802. eCollection 2019 Jun.

Authors

Emmanuel Gbenga Dada¹, Joseph Stephen Bassi¹, Haruna Chiroma², Shafi'i Muhammad Abdulhamid³, Adebayo Olusola Adetunmbi⁴, Opeyemi Emmanuel Ajibuwa⁵

Affiliations

¹ Department of Computer Engineering, University of Maiduguri, Maiduguri, Nigeria.
² Department of Computer Science, Federal College of Education (Technical), Gombe, Nigeria.
³ Department of Cyber Security Science, Federal University of Technology Minna, Minna, Nigeria.
⁴ Department of Computer Science, Federal University of Technology Akure, Akure, Nigeria.
⁵ Department of Electrical Engineering, University of Ilorin, Ilorin, Nigeria.

Abstract

The upsurge in the volume of unwanted emails, known as spam, has created an intense need for more dependable and robust antispam filters. Machine learning methods have recently been used to detect and filter spam emails successfully. We present a systematic review of some of the popular machine learning based email spam filtering approaches. Our review surveys the important concepts, attempts, efficiency, and research trends in spam filtering. The preliminary discussion in the study background examines how leading internet service providers (ISPs) such as Gmail, Yahoo and Outlook apply machine learning techniques in their email spam filters. We then discuss the general email spam filtering process and the efforts of different researchers to combat spam through the use of machine learning techniques. Our review compares the strengths and drawbacks of existing machine learning approaches and identifies the open research problems in spam filtering. We recommend deep learning and deep adversarial learning as the future techniques that can effectively handle the menace of spam emails.

Keywords: Analysis of algorithms; Computer privacy; Computer science; Computer security; Deep learning; Machine learning; Naïve Bayes; Neural networks; Spam filtering; Support vector machines.