Unsupervised Medical Entity Recognition and Linking in Chinese Online Medical Text

J Healthc Eng. 2018 Apr 18;2018:2548537. doi: 10.1155/2018/2548537. eCollection 2018.

Abstract

Online medical text is full of references to medical entities (MEs), which are valuable in many applications, including medical knowledge base (KB) construction, decision support systems, and the treatment of diseases. However, the diverse and ambiguous surface forms of these mentions make ME identification difficult. Many existing solutions rely on supervised approaches, which are often task-dependent: applying them to different kinds of corpora, or identifying new entity categories, requires major effort in data annotation and feature definition. In this paper, we propose unMERL, an unsupervised framework for recognizing and linking medical entities mentioned in Chinese online medical text. For ME recognition, unMERL first uses a knowledge-driven approach to extract candidate entities from free text. The categories of the candidate entities are then determined using an approach based on distributed semantics. For ME linking, we propose a collaborative inference approach that takes full advantage of heterogeneous entity knowledge and unstructured information in the KB. Experimental results on real corpora demonstrate significant benefits over recent approaches in both ME recognition and linking.
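The two recognition steps named in the abstract can be illustrated with a minimal sketch. This is not the unMERL implementation, whose details the abstract does not give. It assumes that candidate extraction is a lexicon lookup by forward maximum matching and that category assignment is a cosine-similarity comparison against per-category seed vectors. The lexicon, the 3-dimensional vectors, the category names, and the function names are all hypothetical, and the linking step is not covered.

```python
import math

# Hypothetical toy lexicon; a real system would load entries from a medical KB.
LEXICON = {"糖尿病", "胰岛素", "高血压", "头痛"}
MAX_LEN = max(len(term) for term in LEXICON)

# Made-up 3-dimensional vectors standing in for learned distributed representations.
EMBEDDINGS = {
    "糖尿病": [0.9, 0.1, 0.0],
    "高血压": [0.8, 0.2, 0.1],
    "胰岛素": [0.1, 0.9, 0.1],
    "头痛": [0.1, 0.1, 0.9],
}
# Hypothetical seed vector per category (assumption, not from the paper).
CATEGORY_SEEDS = {
    "disease": [1.0, 0.0, 0.0],
    "drug": [0.0, 1.0, 0.0],
    "symptom": [0.0, 0.0, 1.0],
}


def extract_candidates(text):
    """Forward maximum matching: at each offset, keep the longest lexicon entry."""
    candidates = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest possible span first, then shrink.
        for length in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in LEXICON:
                match = piece
                break
        if match:
            candidates.append((i, match))
            i += len(match)
        else:
            i += 1
    return candidates


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def categorize(term):
    """Assign the category whose seed vector is most similar to the term vector."""
    vec = EMBEDDINGS[term]
    return max(CATEGORY_SEEDS, key=lambda c: cosine(vec, CATEGORY_SEEDS[c]))


if __name__ == "__main__":
    for offset, term in extract_candidates("糖尿病患者需要注射胰岛素"):
        print(offset, term, categorize(term))
```

Neither step in this sketch needs labeled training sentences, which is the property the abstract emphasizes. Coverage depends entirely on the lexicon and on the quality of the vectors.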

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • China
  • Data Curation / methods*
  • Data Mining / methods*
  • Humans
  • Internet
  • Knowledge Bases
  • Medical Informatics / methods*
  • Semantics
  • Unsupervised Machine Learning*