A Syntax-enhanced model based on category keywords for biomedical relation extraction

J Biomed Inform. 2022 Aug:132:104135. doi: 10.1016/j.jbi.2022.104135. Epub 2022 Jul 14.

Abstract

In multi-category biomedical relation extraction, some categories are linguistically similar to one another. The keywords associated with each category and the syntactic structures of the samples in these categories carry distinctive features that are very useful for biomedical relation extraction. Pre-trained models have been widely used and have achieved great success in biomedical relation extraction, but they still cannot mine this kind of information accurately. To address this problem, we present a syntax-enhanced model based on category keywords. First, we prune syntactic dependency trees according to category keywords obtained with the chi-square test. This pruning reduces the noise introduced by current syntactic parsing tools and retains useful category-related information. Next, we present a syntactic transformer to encode the category-related syntactic dependency trees, which enhances the ability of the pre-trained model to capture syntactic structures and to distinguish among multiple categories. We evaluate our method on three biomedical datasets, where it outperforms state-of-the-art models. Further analysis verifies the effectiveness of our method.
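
The two preprocessing steps named in the abstract (chi-square keyword selection and keyword-guided pruning of the dependency tree) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact scoring or pruning rule, so the function names `chi_square_keywords` and `prune_tree`, the 2x2 contingency-table form of the chi-square statistic, the `top_k` cutoff, and the "keep every node on the path from a kept token to the root" rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def chi_square_keywords(samples, top_k=1):
    """Rank tokens per category by a 2x2 chi-square statistic (illustrative).

    samples: list of (tokens, label) pairs.
    Returns {label: [top_k tokens positively associated with that label]}.
    """
    n = len(samples)
    label_counts = Counter(label for _, label in samples)
    term_counts = Counter()          # number of samples containing each token
    joint = defaultdict(Counter)     # joint[label][token] = samples with both
    for tokens, label in samples:
        for tok in set(tokens):
            term_counts[tok] += 1
            joint[label][tok] += 1

    keywords = {}
    for label, n_label in label_counts.items():
        scores = {}
        for tok, n_tok in term_counts.items():
            a = joint[label][tok]    # in category, contains token
            b = n_tok - a            # other categories, contains token
            c = n_label - a          # in category, lacks token
            d = n - a - b - c        # other categories, lacks token
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            # Skip degenerate tables and tokens not positively associated.
            if denom == 0 or a * d <= b * c:
                continue
            scores[tok] = n * (a * d - b * c) ** 2 / denom
        # Sort by descending score, breaking ties alphabetically.
        keywords[label] = sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
    return keywords


def prune_tree(heads, keep):
    """Keep the given token indices plus their ancestors up to the root.

    heads: heads[i] is the index of token i's head, or -1 for the root.
    keep:  indices of tokens to retain (e.g. entities and category keywords).
    This pruning rule is an assumption, not the rule used in the paper.
    """
    kept = set()
    for i in keep:
        while i != -1 and i not in kept:
            kept.add(i)
            i = heads[i]
    return sorted(kept)


# Toy data, invented for illustration only.
samples = [
    (["drug", "inhibits", "enzyme"], "inhibitor"),
    (["compound", "inhibits", "protein"], "inhibitor"),
    (["drug", "activates", "receptor"], "activator"),
    (["compound", "activates", "protein"], "activator"),
]
keywords = chi_square_keywords(samples, top_k=1)

# "the drug strongly inhibits the enzyme": heads given as token indices.
heads = [1, 3, 3, -1, 5, 3]
pruned = prune_tree(heads, keep={1, 5})
```

In this toy case the modifier "strongly" and both determiners are dropped, while the two entity tokens and the verb linking them are retained. A real pipeline would obtain `heads` from a dependency parser and would need the paper's actual pruning criteria.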

Keywords: Category keywords; Multi-category biomedical relation extraction; Syntactic dependency tree; Syntax-enhanced model.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Linguistics*