Check-worthy claim detection across topics for automated fact-checking

Amani S Abumansour; Arkaitz Zubiaga

doi:10.7717/peerj-cs.1365

PeerJ Comput Sci. 2023 May 16:9:e1365. doi: 10.7717/peerj-cs.1365. eCollection 2023.

Authors

Amani S Abumansour¹,², Arkaitz Zubiaga¹

Affiliations

¹ Queen Mary University of London, London, United Kingdom.
² Taif University, Taif, Saudi Arabia.

Abstract

An important component of an automated fact-checking system is the claim check-worthiness detection system, which ranks sentences by how much they need to be checked. Despite a body of research tackling the task, previous research has overlooked the challenging nature of identifying check-worthy claims across different topics. In this article, we assess and quantify the challenge of detecting check-worthy claims for new, unseen topics. After highlighting the problem, we propose the AraCWA model to mitigate the performance deterioration when detecting check-worthy claims across topics. The AraCWA model boosts performance on new topics by incorporating two components, one for few-shot learning and one for data augmentation. Using a publicly available dataset of Arabic tweets covering 14 different topics, we demonstrate that our proposed data augmentation strategy achieves substantial overall improvements, although the extent of the improvement varies by topic. Further, we analyse the semantic similarities between topics, suggesting that the similarity metric could be used as a proxy to determine the difficulty level of an unseen topic prior to undertaking the task of labelling the underlying sentences.
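
The idea of using topic similarity as a proxy for difficulty can be illustrated with a minimal sketch. The abstract does not say how similarity is measured, so the choices here are assumptions made purely for illustration: each topic is represented by whitespace-token counts over its sentences, and similarity is the cosine between those count vectors. The function names (`topic_vector`, `cosine_similarity`, `rank_seen_topics`) are hypothetical and are not taken from the AraCWA implementation; real Arabic tweets would also need proper tokenisation and normalisation.

```python
import math
from collections import Counter


def topic_vector(sentences):
    """Sum whitespace-token counts over all sentences of one topic (assumed representation)."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_seen_topics(unseen_sentences, seen_topics):
    """Return (topic name, similarity) pairs for the seen topics, most similar first."""
    unseen = topic_vector(unseen_sentences)
    scores = {
        name: cosine_similarity(unseen, topic_vector(sentences))
        for name, sentences in seen_topics.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, an unseen topic whose best similarity to any seen topic is low would be flagged as likely harder before any of its sentences are labelled.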

Keywords: Automated fact-checking system; Check-worthiness; Check-worthy; Claim detection; Cross-topic.

Grants and funding

Amani S. Abumansour holds a scholarship from Taif University, Saudi Arabia. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.