CPBA-CLIM: An entity-relation extraction model for ontology-based knowledge graph construction in hazardous chemical incident management

Sci Prog. 2024 Jan-Mar;107(1):368504241235510. doi: 10.1177/00368504241235510.

Abstract

In recent years, hazardous chemical incidents caused by human or natural factors have occurred frequently, resulting in significant casualties, property damage, and environmental pollution. Accurately mining the lessons from accumulated incident reports and constructing a knowledge graph for hazardous chemical incident management can help managers identify patterns and analyze common attributes, thereby preventing similar incidents from recurring. This article addresses the challenges posed by dispersed textual information, specialized vocabulary, and data formats in hazardous chemical incident reports. We propose a novel entity-relation extraction model called CPBA-CLIM (content-position-based attention-cross-label intersect matching) to provide an accurate data foundation for constructing the hazardous chemical incident knowledge graph. The content-position-based attention module incorporates contextual semantic information into the combined content and position encoding of BERT (bidirectional encoder representations from transformers) to obtain dynamic word vectors that align with the thematic context of the text. Additionally, the cross-label intersect matching strategy evaluates the plausibility of entity-relation interactions in sets containing potential overlaps, reducing the impact of entity-relation overlap on triplet extraction accuracy. Comparative experiments on public datasets demonstrate the model's strong performance on overlapping triplets. Qualitative experiments on a self-constructed dataset integrate our model with ontology construction techniques, successfully establishing a knowledge graph for managing hazardous chemical incidents. This research improves the degree of automation and the efficiency of knowledge graph construction, thus offering support and a basis for decision-making in hazardous chemical safety management.

Keywords: Contextual semantic; content-position-based attention module; cross-label intersect matching strategy; overlapping triplets.
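
Illustrative note: the abstract refers to "overlapping triplets" without defining them. The minimal Python sketch below shows one common way the relation-extraction literature categorizes overlap among (subject, relation, object) triplets: entity-pair overlap (EPO), where two triplets share both entities, and single-entity overlap (SEO), where they share only one. It is not the CLIM strategy itself, whose details the abstract does not give. The function name `overlap_labels`, the label strings, and the example triplets are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# A triplet is (subject, relation, object). All names here are hypothetical.
Triplet = Tuple[str, str, str]


def overlap_labels(triplets: List[Triplet]) -> Dict[Triplet, Set[str]]:
    """Label each triplet with the kinds of overlap it has with the others.

    EPO: another triplet involves the same two entities (in either order).
    SEO: another triplet shares exactly one of the two entities.
    Normal: the triplet shares no entity with any other triplet.
    """
    labels: Dict[Triplet, Set[str]] = {}
    for i, (subj, rel, obj) in enumerate(triplets):
        tags: Set[str] = set()
        for j, (subj2, _rel2, obj2) in enumerate(triplets):
            if i == j:
                continue
            pair, other_pair = {subj, obj}, {subj2, obj2}
            if pair == other_pair:
                tags.add("EPO")
            elif pair & other_pair:
                tags.add("SEO")
        labels[(subj, rel, obj)] = tags or {"Normal"}
    return labels


# Made-up triplets in the style of an incident report (not from the paper's data).
EXAMPLE_TRIPLETS: List[Triplet] = [
    ("tank T1", "stores", "benzene"),
    ("tank T1", "leaked", "benzene"),
    ("benzene", "caused", "fire"),
    ("operator", "reported_to", "safety office"),
]

if __name__ == "__main__":
    for triplet, tags in overlap_labels(EXAMPLE_TRIPLETS).items():
        print(triplet, sorted(tags))
```

In this made-up example, the two `tank T1` triplets share both entities (EPO) and also share `benzene` with the third triplet (SEO), while the last triplet overlaps with nothing. Sentences whose triplets fall into the EPO or SEO categories are the cases an extraction model must handle without confusing which relation links which entity pair.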