Exploring Representations for Singular and Multi-Concept Relations for Biomedical Named Entity Normalization

Proc Int World Wide Web Conf. 2022 Apr:2022:823-832. doi: 10.1145/3487553.3524701. Epub 2022 Aug 16.

Abstract

Since the onset of the COVID-19 pandemic, peer-reviewed biomedical repositories have experienced a surge in chemical- and disease-related queries. These queries follow a wide variety of naming conventions and nomenclatures, ranging from trademark and generic names to chemical composition mentions. Normalizing, or disambiguating, these mentions within texts gives researchers and data curators more relevant articles in response to their search queries. Named entity normalization aims to automate this disambiguation process by linking entity mentions to their appropriate candidate concepts within a biomedical knowledge base or ontology. We explore several term embedding aggregation techniques and examine how a term's context affects evaluation performance. We also evaluate our embedding approaches for normalizing term instances containing one or many relations within unstructured texts.
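
As a rough illustration of the general idea of embedding-based normalization (not the authors' models or data), the sketch below aggregates the token vectors of a multi-word mention by mean pooling and links the mention to the concept whose name embedding has the highest cosine similarity. The toy vectors, the concept identifiers, and the function names `mean_pool`, `cosine`, and `normalize` are all assumptions made for this example; mean pooling stands in for whichever aggregation techniques the paper actually evaluates.

```python
import math

# Toy 3-dimensional token embeddings (hypothetical values, illustration only).
TOKEN_VECTORS = {
    "acetylsalicylic": [0.9, 0.1, 0.0],
    "acid": [0.7, 0.2, 0.1],
    "aspirin": [0.8, 0.15, 0.05],
    "fever": [0.0, 0.9, 0.2],
    "pyrexia": [0.05, 0.85, 0.25],
}

# Hypothetical concept inventory: placeholder identifier -> preferred name.
# These are not real MeSH identifiers or CUIs.
CONCEPTS = {
    "CONCEPT_A": "aspirin",
    "CONCEPT_B": "fever",
}


def mean_pool(term):
    """Aggregate a (possibly multi-token) term by averaging its token vectors."""
    vectors = [TOKEN_VECTORS[t] for t in term.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        raise ValueError(f"no known tokens in {term!r}")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def normalize(mention):
    """Return the identifier of the concept most similar to the mention."""
    query = mean_pool(mention)
    return max(CONCEPTS, key=lambda cid: cosine(query, mean_pool(CONCEPTS[cid])))


if __name__ == "__main__":
    for mention in ("acetylsalicylic acid", "pyrexia"):
        print(mention, "->", normalize(mention))
```

In practice the token vectors would come from a trained embedding model or transformer, and the candidate set would be drawn from a knowledge base such as MeSH; the nearest-neighbor lookup shown here is only one simple way to perform the linking step.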

Keywords: MeSH identifier; concept linking; concept mapping; concept normalization; concept unique identifier; datasets; entity linking; entity normalization; named entity disambiguation; named entity linking; named entity normalization; neural networks; transformer; word embeddings.