A combination of TEXTCNN model and Bayesian classifier for microblog sentiment analysis

Zhanfeng Wang; Lisha Yao; Xiaoyu Shao; Honghai Wang

doi:10.1007/s10878-023-01038-1

J Comb Optim. 2023;45(4):109. doi: 10.1007/s10878-023-01038-1. Epub 2023 May 11.

Authors

Zhanfeng Wang¹, Lisha Yao², Xiaoyu Shao², Honghai Wang¹

Affiliations

¹ School of Computer Science and Artificial Intelligence, Chaohu University, Hefei, 238024 Anhui China.
² School of Big Data and Artificial Intelligence, Anhui Xinhua University, Hefei, 230088 Anhui China.

Abstract

Research on the emotional information in microblog comments is attracting growing attention, and TEXTCNN is being adopted rapidly for short-text tasks. However, the TEXTCNN training model is not very extensible or interpretable, which makes it difficult to quantify and evaluate the relative importance of individual features. In addition, a single word embedding cannot resolve polysemy. To address these shortcomings, this research proposes a microblog sentiment analysis method based on TEXTCNN and Bayes. First, word embedding vectors are obtained with the word2vec tool and, from these, the ELMo model generates ELMo word vectors that integrate contextual features and different semantic features. Second, the convolution and pooling layers of the TEXTCNN model extract local features of the ELMo word vectors from multiple angles. Finally, a Bayes classifier is combined with these features to complete the training task of emotion data classification. On the Stanford Sentiment Treebank (SST) data set, the proposed model is compared with the TEXTCNN, LSTM, and LSTM-TEXTCNN models. Its Accuracy, Precision, Recall, and F1-score are 0.9813, 0.9821, 0.9804 and 0.9812, respectively, all substantially higher than those of the comparison models, indicating that the method can be used effectively for accurate emotion analysis and identification of events in microblog sentiment analysis.
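The second and third steps of the method (convolution and pooling over word vectors, then a Bayes classifier on the pooled features) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several things are assumptions made for illustration only: the random vectors standing in for ELMo word vectors, the untrained random filters, the kernel widths (2, 3, 4), the choice of a Gaussian naive Bayes variant, and every function and class name.

```python
import numpy as np


def textcnn_features(embedded, filters):
    """Max-over-time pooled convolution features for one sentence.

    embedded: (seq_len, emb_dim) array of word vectors.
    filters:  list of (n_filters, width, emb_dim) arrays, one per kernel width.
    """
    pooled = []
    for bank in filters:
        width = bank.shape[1]
        # Sliding windows of `width` consecutive word vectors.
        windows = np.stack([embedded[i:i + width]
                            for i in range(embedded.shape[0] - width + 1)])
        # Convolution as a tensor contraction, then ReLU: (n_windows, n_filters).
        scores = np.tensordot(windows, bank, axes=([1, 2], [1, 2]))
        activations = np.maximum(scores, 0.0)
        pooled.append(activations.max(axis=0))  # max-over-time pooling
    return np.concatenate(pooled)


class GaussianNaiveBayes:
    """Hypothetical minimal Gaussian naive Bayes over real-valued features."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # A small constant keeps variances strictly positive.
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-6
        self.log_priors_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        # log p(c) + sum_j log N(x_j | mean_cj, var_cj) for every class c.
        diff = X[:, None, :] - self.means_[None, :, :]
        log_lik = -0.5 * (np.log(2 * np.pi * self.vars_)[None]
                          + diff ** 2 / self.vars_[None]).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_priors_[None], axis=1)]


rng = np.random.default_rng(0)
emb_dim, n_filters, widths = 8, 4, (2, 3, 4)
# Assumption: random, untrained filters; a real model would learn these.
filters = [rng.normal(size=(n_filters, w, emb_dim)) for w in widths]


def toy_sentence(label):
    # Placeholder for contextual word vectors: class-dependent mean plus noise.
    mean = 1.0 if label == 1 else -1.0
    return rng.normal(loc=mean, scale=0.1, size=(10, emb_dim))


labels = np.array([0, 1] * 20)
features = np.stack([textcnn_features(toy_sentence(y), filters) for y in labels])
clf = GaussianNaiveBayes().fit(features, labels)
train_acc = float((clf.predict(features) == labels).mean())
print(f"training accuracy on toy data: {train_acc:.2f}")
```

The toy data is synthetic and trivially separable, so the printed accuracy says nothing about the results reported in the abstract. The sketch only shows how pooled convolutional features can be passed to a Bayes classifier in place of the usual softmax layer.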

Keywords: Bayesian classifier; ELMo (Embedding from language models); Sentiment analysis; TEXTCNN.

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.