Integrating Structured and Unstructured EHR Data Using an FHIR-based Type System: A Case Study with Medication Data

Na Hong; Andrew Wen; Feichen Shen; Sunghwan Sohn; Sijia Liu; Hongfang Liu; Guoqian Jiang

AMIA Jt Summits Transl Sci Proc. 2018 May 18;2017:74-83. eCollection 2018.

Authors

Na Hong¹, Andrew Wen¹, Feichen Shen¹, Sunghwan Sohn¹, Sijia Liu¹, Hongfang Liu¹, Guoqian Jiang¹

Affiliation

¹ Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.

PMID: 29888045
PMCID: PMC5961797

Abstract

Standards-based modeling of electronic health record (EHR) data is important for data interoperability and large-scale use. Integrating unstructured data into a standard data model, however, poses unique challenges, in part because existing clinical NLP systems use heterogeneous type systems. We introduce a scalable, standards-based framework for integrating structured and unstructured EHR data that leverages the HL7 Fast Healthcare Interoperability Resources (FHIR) specification. We implemented a clinical NLP pipeline enhanced with an FHIR-based type system and performed a case study using medication data from Mayo Clinic's EHR. Two UIMA-based NLP tools, MedXN and MedTime, were integrated into the pipeline to extract FHIR MedicationStatement resources and related attributes from unstructured medication lists. We developed a rule-based approach for assigning the NLP output types to the FHIR elements represented in the type system; for the structured EHR data, we investigated which FHIR elements the source data belong to. We used the FHIR resource "MedicationStatement" as an example to illustrate our integration framework and methods. For evaluation, we manually annotated FHIR elements in 166 medication statements from 14 clinical notes generated by Mayo Clinic in the course of patient care, and computed standard performance measures (precision, recall, and F-measure). F-scores ranged from 0.73 to 0.99 across the FHIR element representations. The results demonstrate that our framework based on the FHIR type system is feasible for normalizing and integrating both structured and unstructured EHR data.
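
As an illustration only, the rule-based assignment and the evaluation measures described in the abstract could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the attribute names (`drug`, `strength`, `frequency`, `route`), the simplified element path strings, and the helper names `map_to_fhir` and `precision_recall_f1` are all hypothetical and are not taken from MedXN, MedTime, or the FHIR specification.

```python
from typing import Dict, Set, Tuple

# Hypothetical rule table: NLP attribute name -> simplified FHIR element path.
# Both sides are illustrative assumptions, not validated against the FHIR spec.
RULES: Dict[str, str] = {
    "drug": "medicationCodeableConcept.text",
    "strength": "dosage.doseQuantity",
    "frequency": "dosage.timing.code.text",
    "route": "dosage.route.text",
}


def map_to_fhir(nlp_attrs: Dict[str, str]) -> Dict[str, str]:
    """Apply the rule table; attributes without a rule are dropped."""
    return {RULES[name]: value for name, value in nlp_attrs.items() if name in RULES}


def precision_recall_f1(predicted: Set, gold: Set) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 over sets of (element, value) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up example: one extracted medication mention and its manual annotation.
nlp_output = {"drug": "aspirin", "strength": "81 mg", "frequency": "daily", "form": "tablet"}
gold_annotation = {
    ("medicationCodeableConcept.text", "aspirin"),
    ("dosage.doseQuantity", "81 mg"),
    ("dosage.timing.code.text", "daily"),
    ("dosage.route.text", "oral"),
}

predicted = set(map_to_fhir(nlp_output).items())
precision, recall, f1 = precision_recall_f1(predicted, gold_annotation)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up example the route is never extracted, so recall falls below precision. Scoring each FHIR element separately in the same way would give per-element F-scores of the kind the abstract reports.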

Grants and funding

U01 HG009450/HG/NHGRI NIH HHS/United States