A Framework for Automatic Clustering of EHR Messages Using a Spatial Clustering Approach

Healthcare (Basel). 2023 Jan 30;11(3):390. doi: 10.3390/healthcare11030390.

Abstract

Although Health Level Seven (HL 7) message standards (v2, v3, Clinical Document Architecture (CDA)) have been commonly adopted, there are still issues associated with them, especially the semantic interoperability issues and lack of support for smart devices (e.g., smartphones, fitness trackers, and smartwatches), etc. In addition, healthcare organizations in many countries are still using proprietary electronic health record (EHR) message formats, making it challenging to convert to other data formats-particularly the latest HL7 Fast Health Interoperability Resources (FHIR) data standard. The FHIR is based on modern web technologies such as HTTP, XML, and JSON and would be capable of overcoming the shortcomings of the previous standards and supporting modern smart devices. Therefore, the FHIR standard could help the healthcare industry to avail the latest technologies benefits and improve data interoperability. The data representation and mapping from the legacy data standards (i.e., HL7 v2 and EHR) to the FHIR is necessary for the healthcare sector. However, direct data mapping or conversion from the traditional data standards to the FHIR data standard is challenging because of the nature and formats of the data. Therefore, in this article, we propose a framework that aims to convert proprietary EHR messages into the HL7 v2 format and apply an unsupervised clustering approach using the DBSCAN (density-based spatial clustering of applications with noise) algorithm to automatically group a variety of these HL7 v2 messages regardless of their semantic origins. The proposed framework's implementation lays the groundwork to provide a generic mapping model with multi-point and multi-format data conversion input into the FHIR. Our experimental results show the proposed framework's ability to automatically cluster various HL7 v2 message formats and provide analytic insight behind them.

Keywords: DBSCAN; EHR; FHIR; clustering; healthcare; machine learning.

Grants and funding

This research received no external funding.