Biomedical named entity normalization via interaction-based synonym marginalization

J Biomed Inform. 2022 Dec:136:104238. doi: 10.1016/j.jbi.2022.104238. Epub 2022 Nov 15.

Abstract

Objective: Biomedical named entity normalization (BNEN) is a fundamental natural language processing (NLP) task in the biomedical domain. Many representation learning-based methods have been successfully applied to BNEN in recent years. Most of them encode a given biomedical named entity mention (BNEM) and its candidates separately. Some consider relations between the BNEM and its candidates, but few consider relations among the candidates themselves, which may be useful for BNEN.

Material and methods: We propose IA-BIOSYN, a novel interaction-based synonym marginalization method for BNEN that captures both the relations between a given mention and its candidates and the relations among the candidates. Given a BNEM, IA-BIOSYN uses a candidate selector to obtain the BNEM's candidates dynamically, then an interaction module to model BNEM-candidate and candidate-candidate relations, and finally a synonym marginalization module to determine which candidate(s) the BNEM should be mapped to. To validate the effectiveness of the proposed method, we compare it with other state-of-the-art (SOTA) methods on three public BNEN datasets: NCBI-Disease, BC5CDR-Disease and BC5CDR-Chemical.
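
The final step of the pipeline can be illustrated with a minimal sketch. It assumes that synonym marginalization means normalizing the candidate scores with a softmax and summing the probability mass of every candidate that shares a concept identifier; the abstract does not spell out the formulation, so this is an assumption. The candidate selector and interaction module are not modeled: the scores are taken as given, and the function names and the identifiers "C1" and "C2" are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw candidate scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_synonym_probability(scores, candidate_ids, concept_id):
    """Sum the probability of all candidates that map to concept_id.

    Assumption: several candidates (synonyms) may share one concept ID,
    and the mention is credited with the total mass of that concept.
    """
    probs = softmax(scores)
    return sum(p for p, cid in zip(probs, candidate_ids) if cid == concept_id)


def predict(scores, candidate_ids):
    """Map the mention to the concept of its top-scoring candidate."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return candidate_ids[best]


# Hypothetical scores for three candidates of one mention;
# the first two candidates are synonyms of the same concept "C1".
scores = [2.0, 1.0, 0.5]
candidate_ids = ["C1", "C1", "C2"]

p_c1 = marginal_synonym_probability(scores, candidate_ids, "C1")
p_c2 = marginal_synonym_probability(scores, candidate_ids, "C2")
predicted = predict(scores, candidate_ids)
```

Under this assumption a training objective would maximize the marginal probability of the gold concept, so that no single synonym has to be designated as the one correct candidate.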

Results: Our proposed method achieves Acc@1 of 0.9333, 0.9379 and 0.9693 on NCBI-Disease, BC5CDR-Disease and BC5CDR-Chemical, respectively, significantly better than other SOTA methods.
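
For readers unfamiliar with the metric, the following minimal sketch shows one way Acc@1 could be computed. It assumes that Acc@1 is the fraction of mentions whose top-ranked candidate maps to one of the mention's gold concept identifiers; the function name and the toy data are hypothetical and do not reproduce any result from the paper.

```python
def accuracy_at_1(top1_predictions, gold_ids):
    """Fraction of mentions whose top-1 predicted concept is a gold concept.

    top1_predictions: one predicted concept ID per mention.
    gold_ids: one set of acceptable concept IDs per mention.
    """
    if len(top1_predictions) != len(gold_ids):
        raise ValueError("predictions and gold labels must align")
    if not gold_ids:
        raise ValueError("at least one mention is required")
    hits = sum(1 for pred, gold in zip(top1_predictions, gold_ids) if pred in gold)
    return hits / len(gold_ids)


# Toy example: three of the four mentions are normalized correctly.
preds = ["C1", "C2", "C3", "C4"]
gold = [{"C1"}, {"C2"}, {"C9"}, {"C4", "C5"}]
acc = accuracy_at_1(preds, gold)
```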

Conclusions: Both the relations between a given BNEM and its candidates and the relations among the candidates are useful for BNEN, and the proposed IA-BIOSYN captures both types of relations effectively.

Keywords: Biomedical named entity normalization; Biomedical named entity-candidate interaction; Candidate-candidate interaction; Synonym marginalization.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Natural Language Processing*