Detecting and Analyzing Suicidal Ideation on Social Media Using Deep Learning and Machine Learning Models

Theyazn H H Aldhyani; Saleh Nagi Alsubari; Ali Saleh Alshebami; Hasan Alkahtani; Zeyad A T Ahmed

doi:10.3390/ijerph191912635

Int J Environ Res Public Health. 2022 Oct 3;19(19):12635. doi: 10.3390/ijerph191912635.

Authors

Theyazn H H Aldhyani¹, Saleh Nagi Alsubari², Ali Saleh Alshebami¹, Hasan Alkahtani³, Zeyad A T Ahmed²

Affiliations

¹ Applied College in Abqaiq, King Faisal University, P.O. Box 400, Al-Ahsa 31982, Saudi Arabia.
² Department of Computer Science, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad 431004, India.
³ College of Computer Science and Information Technology, King Faisal University, P.O. Box 400, Al-Ahsa 31982, Saudi Arabia.

Abstract

Individuals who experience suicidal ideation frequently express their thoughts and feelings on social media, and several studies have found that people who are contemplating suicide can be identified by analyzing their posts. However, finding and comprehending patterns of suicidal ideation is a challenging task. It is therefore essential to develop a machine learning system for the automated early detection of suicidal ideation, or of any abrupt change in a user's behavior, by analyzing the user's social media posts. In this paper, we propose an experiment-based methodology for building a suicidal ideation detection system that uses publicly available Reddit datasets, text-representation approaches such as TF-IDF and Word2Vec, and hybrid deep learning and machine learning algorithms for classification. A hybrid convolutional neural network and bidirectional long short-term memory (CNN-BiLSTM) model and the machine learning XGBoost model were used to classify social media posts as suicidal or non-suicidal, using textual and LIWC-22-based features in two experiments. To assess the models' performance, we used the standard metrics of accuracy, precision, recall, and F1-score. A comparison of the test results showed that, with textual features, the CNN-BiLSTM model outperformed the XGBoost model, achieving 95% accuracy in suicidal ideation detection compared with 91.5% for XGBoost. Conversely, with LIWC features, XGBoost performed better than CNN-BiLSTM.
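
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels in which 1 stands for a suicidal post and 0 for a non-suicidal post; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper, whose own evaluation code and data are not shown here.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption (not from the paper): label 1 is the positive class
    (suicidal post) and label 0 is the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not results from the study.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
```

Reporting precision and recall alongside accuracy shows separately how many flagged posts were truly positive and how many positive posts were missed, which accuracy alone does not reveal.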

Keywords: LIWC-22; artificial intelligence; machine learning; suicidal ideation.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Deep Learning*
Female
Humans
Machine Learning
Male
Neural Networks, Computer
Social Media*
Suicidal Ideation

Grants and funding

This research and the APC were funded by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Project No. GRANT 365].