Identification of Similar Patients Through Medical Concept Embedding from Electronic Health Records: A Feasibility Study for Rare Disease Diagnosis

Stud Health Technol Inform. 2021 May 27:281:600-604. doi: 10.3233/SHTI210241.

Abstract

Identifying patients with similar clinical profiles, and deriving insights from the records and outcomes of those patients, can support fast and precise diagnosis and other clinical decisions for rare diseases. Similarity methods must take into account both the semantic relations between medical concepts and the differing relevance of the medical concepts present in patients' medical records. In this paper, we introduce methods developed for rare disease screening and diagnosis from a clinical data warehouse, using medical concept embedding and adjusted aggregations. In preliminary results, our methods outperformed baseline methods, with a significant improvement in precision among the top-ranked similar patients. This is encouraging for further fine-tuning and for application to a large-scale dataset to identify new or candidate patients.
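The general idea of embedding-based patient similarity can be illustrated with a minimal sketch. The abstract does not specify the embedding model or the form of the "adjusted aggregations", so everything below is an assumption for illustration only: the concept names, the toy three-dimensional vectors, and the use of an IDF-style weight (rarer concepts count more) as a stand-in for concept relevance. Patients are represented as weighted averages of their concept vectors, compared by cosine similarity, and evaluated with precision among the top-k ranked patients.

```python
import math
from collections import Counter

# Hypothetical concept embeddings (toy values). In practice these would be
# learned from EHR data, for example with a word2vec-style model.
EMBEDDINGS = {
    "renal_cyst":     [0.9, 0.1, 0.0],
    "kidney_failure": [0.8, 0.2, 0.1],
    "hypertension":   [0.3, 0.7, 0.1],
    "cough":          [0.0, 0.2, 0.9],
    "fever":          [0.1, 0.1, 0.8],
}

# Hypothetical patient records: each patient is a list of medical concepts.
PATIENTS = {
    "p1": ["renal_cyst", "hypertension"],
    "p2": ["kidney_failure", "hypertension"],
    "p3": ["cough", "fever"],
    "p4": ["fever", "hypertension"],
}


def idf_weights(patients):
    """Assumed relevance weighting: concepts seen in fewer patients weigh more."""
    n = len(patients)
    doc_freq = Counter(c for concepts in patients.values() for c in set(concepts))
    return {c: math.log(n / df) + 1.0 for c, df in doc_freq.items()}


def patient_vector(concepts, weights):
    """Weighted average of the embeddings of a patient's known concepts."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for concept in concepts:
        if concept not in EMBEDDINGS:
            continue  # skip concepts without an embedding
        w = weights.get(concept, 1.0)
        for i, x in enumerate(EMBEDDINGS[concept]):
            total[i] += w * x
        weight_sum += w
    return [x / weight_sum for x in total] if weight_sum else total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_similar(query_id, patients):
    """Rank all other patients by similarity to the query patient."""
    weights = idf_weights(patients)
    vectors = {pid: patient_vector(c, weights) for pid, c in patients.items()}
    query = vectors[query_id]
    scores = [(pid, cosine(query, vec))
              for pid, vec in vectors.items() if pid != query_id]
    return sorted(scores, key=lambda s: s[1], reverse=True)


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked patients that are truly relevant."""
    return sum(1 for pid in ranked_ids[:k] if pid in relevant_ids) / k


if __name__ == "__main__":
    ranking = rank_similar("p1", PATIENTS)
    for pid, score in ranking:
        print(pid, round(score, 3))
    print("precision@1:", precision_at_k([pid for pid, _ in ranking], {"p2"}, 1))
```

In this toy data, p1 and p2 share no rare concept literally ("renal_cyst" versus "kidney_failure"), yet p2 ranks first for p1 because the two concepts have nearby embeddings. This is the kind of semantic relation that exact concept matching would miss.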

Keywords: Electronic Health Records; Patient similarity; Rare disease diagnosis; Word embedding.

MeSH terms

  • Data Warehousing
  • Electronic Health Records*
  • Feasibility Studies
  • Humans
  • Rare Diseases* / diagnosis
  • Semantics