Exploiting document graphs for inter sentence relation extraction

J Biomed Semantics. 2022 Jun 3;13(1):15. doi: 10.1186/s13326-022-00267-3.

Abstract

Background: Most previous relation extraction (RE) studies have focused on intra sentence relations and have ignored relations that span sentences, i.e., inter sentence relations. Such relations connect entities at the document level rather than as relational facts within a single sentence. Extracting facts that are expressed across sentences poses challenges and requires different approaches from those usually applied in recent intra sentence relation extraction. Despite recent results, there are still limitations to be overcome.

Results: We present a novel representation for a sequence of consecutive sentences, the document subgraph, for extracting inter sentence relations. Experiments on the BioCreative V Chemical-Disease Relation corpus demonstrate the advantages and robustness of our system in extracting both intra sentence and inter sentence relations from biomedical literature abstracts. The experimental results are comparable to those of state-of-the-art approaches and demonstrate the effectiveness of graphs, a deep learning-based model, and other processing techniques. Further experiments verify the rationale for, and the impact of, the various kinds of additional information and the model components.

Conclusions: Our proposed graph-based representation helps to extract ∼50% of inter sentence relations and improves both precision and recall over the baseline model.
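
Illustration: the abstract does not specify how a document subgraph is built, so the following is only a minimal sketch of the general idea, not the authors' construction. It assumes that tokens of consecutive sentences are nodes, that dependency edges connect tokens within a sentence, and that two hypothetical kinds of link (root to root, and last token to first token) connect neighbouring sentences. The function names, edge types, and toy sentences are all assumptions made for this example. Enumerating simple paths between two entity nodes then yields multiple candidate paths for a cross-sentence entity pair.

```python
from collections import defaultdict


def build_document_subgraph(sentences, dep_edges, roots):
    """Build an undirected graph over the tokens of consecutive sentences.

    sentences: list of token lists
    dep_edges: per-sentence list of (head_index, child_index) pairs
    roots:     per-sentence index of the dependency root
    Nodes are (sentence_index, token_index) tuples.
    All edge types here are assumptions made for illustration.
    """
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    # Intra sentence edges: dependency arcs, treated as undirected.
    for s, edges in enumerate(dep_edges):
        for head, child in edges:
            link((s, head), (s, child))

    # Inter sentence edges between each pair of neighbouring sentences.
    for s in range(len(sentences) - 1):
        link((s, roots[s]), (s + 1, roots[s + 1]))       # root to root
        link((s, len(sentences[s]) - 1), (s + 1, 0))     # last token to first token
    return graph


def simple_paths(graph, source, target, max_len=8):
    """Return all simple paths from source to target with at most max_len edges."""
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        if len(path) > max_len:  # path already has max_len edges
            continue
        for nxt in sorted(graph[node]):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return paths


# Toy two-sentence example (invented, not taken from the corpus).
sentences = [["Aspirin", "was", "given"], ["Patients", "developed", "ulcers"]]
dep_edges = [[(2, 0), (2, 1)], [(1, 0), (1, 2)]]
roots = [2, 1]

graph = build_document_subgraph(sentences, dep_edges, roots)
chemical, disease = (0, 0), (1, 2)
paths = sorted(simple_paths(graph, chemical, disease), key=len)
shortest_tokens = [sentences[s][t] for s, t in paths[0]]
```

In this toy case the chemical and the disease sit in different sentences, and the two kinds of inter sentence link give two distinct paths between them; a downstream model could then encode each path separately.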

Keywords: Convolutional neural network; Deep learning; Graph; Multiple paths; Relation extraction.

MeSH terms

  • Publications*