Entity relationship extraction from Chinese electronic medical records based on feature augmentation and cascade binary tagging framework

Math Biosci Eng. 2024 Jan;21(1):1342-1355. doi: 10.3934/mbe.2024058. Epub 2023 Dec 27.

Abstract

Extracting entity relations from unstructured Chinese electronic medical records is an important task in medical information extraction. However, most Chinese electronic medical records are document-length texts, and existing models either cannot handle such long sequences or perform poorly on them. This paper proposes a neural network that combines feature augmentation with a cascade binary tagging framework. First, a pre-trained model tokenizes the original text and produces word embedding vectors. Second, the word vectors are fed into a feature augmentation network, and its output is fused with the original features and position features. Finally, a cascade binary tagging decoder generates the extracted relations. We also built a Chinese document-level electronic medical record dataset, VSCMeD, which contains 595 real electronic medical records from vascular surgery patients. On VSCMeD, the model achieves a precision of 87.82% and a recall of 88.47%. On another Chinese medical dataset, CMeIE-V2, it achieves a precision of 54.51% and a recall of 48.63%.
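
The decoding step of a cascade binary tagging scheme can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural detail, so the code assumes the common formulation in which the model first tags subject start and end positions with 0/1 labels and then, for each detected subject and each relation type, tags object start and end positions. The function names, the relation label "has_disease", and the hard-coded tag vectors are all hypothetical; in a real system the tags would come from thresholded model outputs.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]
Tags = Tuple[List[int], List[int]]  # (start_tags, end_tags), each 0/1 per token


def decode_spans(start_tags: List[int], end_tags: List[int]) -> List[Span]:
    """Pair each start tag with the nearest end tag at or after it."""
    spans = []
    for i, is_start in enumerate(start_tags):
        if is_start != 1:
            continue
        for j in range(i, len(end_tags)):
            if end_tags[j] == 1:
                spans.append((i, j))
                break
    return spans


def decode_triples(
    tokens: List[str],
    subject_tags: Tags,
    object_tags: Dict[Span, Dict[str, Tags]],
) -> List[Tuple[str, str, str]]:
    """Cascade: decode subjects first, then objects per subject and relation."""
    triples = []
    for s_start, s_end in decode_spans(*subject_tags):
        subject = "".join(tokens[s_start:s_end + 1])
        # Hypothetical lookup: relation-specific object tags for this subject.
        for relation, tags in object_tags.get((s_start, s_end), {}).items():
            for o_start, o_end in decode_spans(*tags):
                obj = "".join(tokens[o_start:o_end + 1])
                triples.append((subject, relation, obj))
    return triples


# Toy example with hand-written tags (assumed, not model output).
tokens = list("患者有高血压")  # six character-level tokens
subject_tags = ([1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0])
object_tags = {
    (0, 1): {"has_disease": ([0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1])},
}
triples = decode_triples(tokens, subject_tags, object_tags)
print(triples)
```

Because object tagging is conditioned on each subject and relation separately, this kind of scheme can represent several triples that share the same subject or object, which is one reason it is often chosen for relation extraction.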

Keywords: Chinese electronic medical records; biomedical information processing; deep learning; natural language processing; neural networks.

MeSH terms

  • China
  • Electronic Health Records*
  • Humans
  • Information Storage and Retrieval
  • Neural Networks, Computer*