Building Fake Review Detection Model Based on Sentiment Intensity and PU Learning

IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):6926-6939. doi: 10.1109/TNNLS.2023.3234427. Epub 2023 Oct 5.

Abstract

Fake review detection involves stream data with a huge processing scale, unbounded growth, and dynamic change, among other characteristics. However, existing fake review detection methods mainly target limited, static review data. In addition, deceptive fake reviews have always been difficult to detect because their characteristics are hidden and diverse. To address these problems, this article proposes a fake review detection model based on sentiment intensity and PU learning (SIPUL), which continuously learns the prediction model from constantly arriving streaming data. First, when streaming data arrive, sentiment intensity is used to divide the reviews into subsets (i.e., a strong sentiment set and a weak sentiment set). Initial positive and negative samples are then extracted from the subsets using the selected completely at random (SCAR) labeling mechanism and the Spy technique. Second, a semi-supervised positive-unlabeled (PU) learning detector is built from the initial samples to iteratively detect fake reviews in the data stream. The initial samples and the PU learning detector are continuously updated according to the detection results. Finally, old data are continually deleted according to historical record points, which keeps the training sample within a manageable size and prevents overfitting. Experimental results show that the model can effectively detect fake reviews, especially deceptive reviews.
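The Spy technique mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the SIPUL implementation: the function name `spy_reliable_negatives`, the parameters `spy_ratio` and `noise`, the two-dimensional feature vectors, and the nearest-centroid scorer standing in for a real classifier are all assumptions made for illustration. The sketch shows only the general idea: hide a few labeled positives ("spies") in the unlabeled set, train with the unlabeled set treated as negative, and keep as reliable negatives the unlabeled samples that score below the spies.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def spy_reliable_negatives(positives, unlabeled, spy_ratio=0.15, noise=0.0, seed=0):
    """Return the unlabeled samples that score below the spy threshold.

    Hypothetical helper: a nearest-centroid scorer stands in for the
    classifier, and `noise` is the fraction of lowest-scoring spies ignored
    when the threshold is set (0.0 means use the minimum spy score).
    """
    if len(positives) < 2:
        raise ValueError("need at least two positives to hold out a spy")
    rng = random.Random(seed)
    n_spies = min(len(positives) - 1, max(1, int(len(positives) * spy_ratio)))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]

    # "Train": remaining positives vs. unlabeled samples plus hidden spies.
    pos_c = centroid(kept)
    neg_c = centroid(unlabeled + spies)

    def score(x):
        # Higher score means closer to the positive centroid.
        return sq_dist(x, neg_c) - sq_dist(x, pos_c)

    # Spies are known positives, so their scores calibrate the threshold.
    spy_scores = sorted(score(s) for s in spies)
    threshold = spy_scores[int(noise * len(spy_scores))]
    return [u for u in unlabeled if score(u) < threshold]


if __name__ == "__main__":
    # Toy feature vectors (made up for illustration, not review data).
    labeled_pos = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9], [1.0, 0.9],
                   [0.9, 1.0], [1.1, 1.0], [1.0, 1.1], [0.95, 1.05]]
    unlabeled = [[1.3, 1.3], [1.25, 1.3],          # positive-like samples
                 [0.0, 0.1], [0.1, 0.0], [0.0, 0.0]]  # negative-like samples
    print(spy_reliable_negatives(labeled_pos, unlabeled))
```

With a single spy and `noise=0.0`, the threshold depends on which positive happens to be held out, so unlabeled positives that score slightly below that spy can be mislabeled as reliable negatives. Larger spy samples and a nonzero `noise` tolerance reduce this sensitivity.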