FGSI: distant supervision for relation extraction method based on fine-grained semantic information

Sci Rep. 2023 Aug 28;13(1):14075. doi: 10.1038/s41598-023-41354-4.

Abstract

Relation extraction is an important step in building a knowledge graph. Its main objective is to extract the semantic relationship between an identified entity pair in a sentence, which makes it central to both semantic understanding and knowledge graph construction. Distantly supervised relation extraction aligns knowledge bases with natural language text to generate labeled data automatically, which alleviates the burden of manually annotating datasets. However, the labeled corpus obtained through distant supervision contains a large amount of noisy data, which greatly affects the training of relation extraction models. In this paper, we hypothesize that key semantic information within a sentence plays a crucial role in distantly supervised relation extraction. Based on this hypothesis, we work at the intra-sentence level and split each sentence into three segments according to the positions of the entities. We then apply an intra-sentence attention mechanism to identify fine-grained semantic features within the sentence and reduce interference from irrelevant, noisy information. We also improve the intra-bag attention mechanism by adding a threshold gate that filters out low-relevance noisy sentences, minimizing the impact of noise on the relation extraction model while making full use of the available positive semantic information. Experimental results show that the proposed model improves on existing methods in terms of the precision-recall curve, P@N, and AUC, demonstrating its effectiveness.
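The two mechanisms named in the abstract, entity-position segmentation and a threshold gate on intra-bag attention, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `split_by_entities` and `gated_bag_attention`, the choice to include each entity token in its neighboring segments, the use of a softmax over raw sentence scores, and the fallback when every sentence falls below the threshold are all assumptions made for illustration.

```python
import math


def split_by_entities(tokens, head_idx, tail_idx):
    """Split a token list into three segments around the two entity positions.

    Assumption: each entity token is kept in both segments it borders.
    """
    first, second = sorted((head_idx, tail_idx))
    left = tokens[: first + 1]            # sentence start through the first entity
    middle = tokens[first : second + 1]   # first entity through the second entity
    right = tokens[second:]               # second entity through sentence end
    return left, middle, right


def gated_bag_attention(scores, threshold):
    """Softmax the per-sentence scores of a bag, zero out weights below
    `threshold`, and renormalize the surviving weights.

    Assumption: if no sentence passes the gate, the ungated weights are returned
    so the bag still contributes to training.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # shift by max for stability
    total = sum(exps)
    weights = [e / total for e in exps]

    kept = [w if w >= threshold else 0.0 for w in weights]
    kept_total = sum(kept)
    if kept_total == 0.0:
        return weights
    return [w / kept_total for w in kept]


if __name__ == "__main__":
    tokens = ["Obama", "was", "born", "in", "Hawaii", "."]
    print(split_by_entities(tokens, head_idx=0, tail_idx=4))
    print(gated_bag_attention([2.0, 0.0, -3.0], threshold=0.1))
```

In this sketch the gate acts on normalized attention weights, so the threshold is interpreted as a minimum share of the bag's attention; a gate on the raw relevance scores would be an equally plausible reading of the abstract.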