Multi-modal affine fusion network for social media rumor detection

PeerJ Comput Sci. 2022 May 3:8:e928. doi: 10.7717/peerj-cs.928. eCollection 2022.

Abstract

With the rapid development of the Internet, people obtain a large amount of information every day from social media platforms such as Twitter and Weibo. However, owing to the complex structure of social media, many rumors, often accompanied by images, are mixed with factual information and spread widely, misleading readers and exerting adverse effects on society. Automatically detecting rumors on social media has become a challenge for contemporary society. To address this challenge, we propose the multimodal affine fusion network (MAFN) combined with entity recognition, a new end-to-end framework that fuses multimodal features to detect rumors effectively. The MAFN consists of four main parts: the entity-recognition-enhanced textual feature extractor, the visual feature extractor, the multimodal affine fuser, and the rumor detector. The entity-recognition-enhanced textual feature extractor extracts textual features from posts whose semantics are enriched by entity recognition, while the visual feature extractor extracts visual features. The multimodal affine fuser takes the three types of modal features and fuses them via an affine transformation; it cooperates with the rumor detector to learn representations that yield reliable fused detection. Extensive experiments on real multimodal Weibo and Twitter datasets verify the effectiveness of the proposed multimodal fusion neural network for rumor detection.

Keywords: Computer vision; Deep learning; Multimodality; Rumor detection; Social media fraud.

Grants and funding

This work was supported by the National Natural Science Foundation of China (Grant No. 61572459). The funders had no role in study design, data collection and analysis, the decision to publish, or preparation of the manuscript.