Text Sentiment Classification Based on BERT Embedding and Sliced Multi-Head Self-Attention Bi-GRU

Sensors (Basel). 2023 Jan 28;23(3):1481. doi: 10.3390/s23031481.

Abstract

In text sentiment analysis, the main problems we face are that traditional word-vector representations lack polysemy, that the Recurrent Neural Network cannot be trained in parallel, and that classification accuracy remains low. We propose a sentiment classification model based on a Sliced Bidirectional Gated Recurrent Unit (Sliced Bi-GRU), a Multi-head Self-Attention mechanism, and Bidirectional Encoder Representations from Transformers (BERT) embeddings. First, the word-vector representations obtained from the BERT pre-trained language model are used as the embedding layer of the neural network. The input sequence is then sliced into subsequences of equal length, and a Bi-GRU is applied to extract the feature information of each subsequence. The relationships between words are then learned via the Multi-head Self-Attention mechanism. Finally, the emotional tendency of the text is output by the Softmax function. Experiments show that the classification accuracy of this model on the Yelp 2015 and Amazon datasets is 74.37% and 62.57%, respectively, and that its training speed exceeds that of most existing models, which verifies the model's effectiveness.
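The two operations specific to this pipeline, slicing the embedded sequence into equal-length subsequences and applying multi-head self-attention, can be sketched as follows. This is a minimal illustration with numpy, not the authors' implementation: the random projection matrices stand in for learned weights, and the embedding dimension (768, BERT-base) and head count are assumed for the example.

```python
import numpy as np

def slice_sequence(x, num_slices):
    """Slice a (seq_len, dim) embedded sequence into equal-length
    subsequences: (num_slices, seq_len // num_slices, dim).
    In the full model, each subsequence is fed to a Bi-GRU in parallel."""
    seq_len, dim = x.shape
    assert seq_len % num_slices == 0, "seq_len must divide evenly"
    return x.reshape(num_slices, seq_len // num_slices, dim)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, seed=0):
    """Scaled dot-product self-attention with num_heads heads over a
    (seq_len, dim) input; random weights stand in for trained ones."""
    seq_len, dim = x.shape
    head_dim = dim // num_heads
    rng = np.random.default_rng(seed)
    heads = []
    for _ in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((dim, head_dim)) * 0.1
                      for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        attn = softmax(q @ k.T / np.sqrt(head_dim))  # (seq_len, seq_len)
        heads.append(attn @ v)                       # (seq_len, head_dim)
    return np.concatenate(heads, axis=-1)            # (seq_len, dim)

# Example: a 32-token sequence of 768-dim BERT embeddings, 4 slices, 8 heads.
x = np.random.default_rng(1).standard_normal((32, 768))
slices = slice_sequence(x, num_slices=4)     # (4, 8, 768)
y = multi_head_self_attention(slices[0], num_heads=8)  # (8, 768)
```

In the paper's architecture, the attention output would then be pooled and passed through a Softmax classification layer to produce the sentiment label; slicing allows the recurrent computation over each subsequence to run in parallel, which is the source of the reported training-speed gain.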

Keywords: BERT word vector; bidirectional slice-gated recurrent unit; multi-head self-attention mechanism; sentiment classification.