Advancing Phishing Email Detection: A Comparative Study of Deep Learning Models

Najwa Altwaijry; Isra Al-Turaiki; Reem Alotaibi; Fatimah Alakeel

doi:10.3390/s24072077

Sensors (Basel). 2024 Mar 24;24(7):2077. doi: 10.3390/s24072077.

Authors

Najwa Altwaijry¹, Isra Al-Turaiki¹, Reem Alotaibi², Fatimah Alakeel³

Affiliations

¹ Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11653, Saudi Arabia.
² Information Technology Department, College of Computer and Information Sciences, King Saud University, Riyadh 11451, Saudi Arabia.
³ Department of Computer Science and Engineering, College of Applied Studies and Community Service, King Saud University, Riyadh 11495, Saudi Arabia.

Abstract

Phishing is one of the most dangerous attacks targeting individuals, organizations, and nations. Although many traditional methods for email phishing detection exist, there is a need to improve accuracy and reduce false-positive rates. To address these challenges, our work investigates one-dimensional CNN-based models (1D-CNNPD) for detecting phishing emails. We further augment the base 1D-CNNPD model with recurrent layers, namely LSTM, Bi-LSTM, GRU, and Bi-GRU, and experiment with the four resulting models. Two benchmark datasets were used to evaluate the performance of our models: Phishing Corpus and Spam Assassin. Our results indicate that, in general, the augmentations improve the performance of the 1D-CNNPD base model. Specifically, the 1D-CNNPD with Bi-GRU yields the best results. Overall, the performance of our models is comparable to the state of the art in CNN-based phishing email detection. The Advanced 1D-CNNPD with Leaky ReLU and Bi-GRU achieved 100% precision, 99.68% accuracy, an F1 score of 99.66%, and a recall of 99.32%. We observe that increasing model depth typically leads to an initial performance improvement, followed by a decline. In conclusion, this study highlights the effectiveness of augmented 1D-CNNPD models in detecting phishing emails with improved accuracy. The reported performance values indicate the potential of these models in advancing the implementation of cybersecurity solutions to combat email phishing attacks.
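
The reported precision, recall, and F1 score can be cross-checked against one another, since F1 is the harmonic mean of precision and recall. The short Python sketch below is only an illustration of that arithmetic; the function name `f1_score` and the variable names are assumptions made for this example and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # conventional value when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the Advanced 1D-CNNPD
# with Leaky ReLU and Bi-GRU (variable names are illustrative).
reported_precision = 1.0     # 100%
reported_recall = 0.9932     # 99.32%

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 99.66%"
```

The computed value agrees with the reported F1 score of 99.66%. A precision of 100% also means that no legitimate email in the evaluation was flagged as phishing, which bears directly on the stated aim of reducing false-positive rates.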

Keywords: BiGRU; BiLSTM; LSTM; convolutional neural networks (CNN); deep learning; email phishing.

Grants and funding

Researchers Supporting Project number (RSPD2023R857), King Saud University