Comparison of Machine Learning and Sentiment Analysis in Detection of Suspicious Online Reviewers on Different Type of Data

Sensors (Basel). 2021 Dec 27;22(1):155. doi: 10.3390/s22010155.

Abstract

The article focuses on solving an important problem of detecting suspicious reviewers in online discussions on social networks. We have concentrated on a special type of suspicious authors, on trolls. We have used methods of machine learning for generation of detection models to discriminate a troll reviewer from a common reviewer, but also methods of sentiment analysis to recognize the sentiment typical for troll's comments. The sentiment analysis can be provided also using machine learning or lexicon-based approach. We have used lexicon-based sentiment analysis for its better ability to detect a dictionary typical for troll authors. We have achieved Accuracy = 0.95 and F1 = 0.80 using sentiment analysis. The best results using machine learning methods were achieved by support vector machine, Accuracy = 0.986 and F1 = 0.988, using a dataset with the set of all selected attributes. We can conclude that detection model based on machine learning is more successful than lexicon-based sentiment analysis, but the difference in accuracy is not so large as in F1 measure.

Keywords: detection of a troll; machine learning; sentiment analysis; text data processing; web mining.

MeSH terms

  • Attitude
  • Machine Learning*
  • Sentiment Analysis*
  • Support Vector Machine