Early Identification of Maternal Cardiovascular Risk Through Sourcing and Preparing Electronic Health Record Data: Machine Learning Study

JMIR Med Inform. 2022 Feb 10;10(2):e34932. doi: 10.2196/34932.

Abstract

Background: Health care data are fragmenting as patients seek care from diverse sources, and the resulting disparate health records negatively affect patient care. Machine learning (ML) has the potential to inform and improve patient care and outcomes. However, ML is challenged by the differences across each individual's health records, the lack of health data standards, and systemic issues that render the data unreliable and prevent a single view of each patient. Although these problems exist throughout health care, they are especially prevalent in maternal health, where they exacerbate the maternal morbidity and mortality crisis in the United States.

Objective: This study aims to demonstrate that patient records extracted from the electronic health records (EHRs) of a large tertiary health care system can be made actionable for ML, with the goal of identifying maternal cardiovascular risk before any evidence of diagnosis or intervention appears in the patient's record. To this end, the extracted maternal patient records were made into patient-specific, complete data sets through a systematic method.

Methods: We outline the effort that was required to define the specifications of the computational systems, define the data set, and obtain access to relevant systems, while ensuring that data security requirements, privacy laws, and policies were met. Data acquisition included the concatenation, anonymization, and normalization of health data across multiple EHRs in preparation for their use by a proprietary risk stratification algorithm. The algorithm is designed to establish patient-specific baselines and to identify cardiovascular risk from deviations from those baselines, in order to inform early interventions.
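
As an illustration only, the data preparation steps named in the Methods (concatenation, anonymization, and normalization across EHRs, followed by comparison against a patient-specific baseline) might be sketched as below. Everything in this sketch is an assumption made for the example: the field names, the two hypothetical source systems, the sample values, the salted-hash pseudonymization, and the z-score deviation measure. The study's risk stratification algorithm is proprietary and is not reproduced here, and a salted hash alone should not be taken as sufficient anonymization under privacy law.

```python
import hashlib
from statistics import mean, stdev

# Hypothetical extracts from two EHR systems; field names, units, and
# values are invented for this sketch and do not come from the study.
EHR_A = [
    {"mrn": "A-001", "date": "2021-01-05", "systolic": 112, "weight_kg": 68.0},
    {"mrn": "A-001", "date": "2021-02-02", "systolic": 115, "weight_kg": 69.5},
]
EHR_B = [
    {"patient_id": "A-001", "visit": "2021-03-01", "sbp": 114, "weight_lb": 156.0},
    {"patient_id": "A-001", "visit": "2021-03-29", "sbp": 141, "weight_lb": 161.0},
]

LB_TO_KG = 0.45359237


def pseudonymize(identifier, salt):
    """Replace an identifier with a truncated salted SHA-256 digest.

    Illustrative only: this is pseudonymization, not a complete
    de-identification procedure.
    """
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()[:16]


def normalize_a(row, salt):
    """Map a system-A row onto the shared (assumed) schema."""
    return {
        "pid": pseudonymize(row["mrn"], salt),
        "date": row["date"],
        "systolic": float(row["systolic"]),
        "weight_kg": float(row["weight_kg"]),
    }


def normalize_b(row, salt):
    """Map a system-B row onto the shared schema, converting lb to kg."""
    return {
        "pid": pseudonymize(row["patient_id"], salt),
        "date": row["visit"],
        "systolic": float(row["sbp"]),
        "weight_kg": round(row["weight_lb"] * LB_TO_KG, 1),
    }


def prepare(ehr_a, ehr_b, salt):
    """Concatenate both sources into one pseudonymized, time-ordered list."""
    rows = [normalize_a(r, salt) for r in ehr_a]
    rows += [normalize_b(r, salt) for r in ehr_b]
    # ISO 8601 date strings sort chronologically as plain strings.
    return sorted(rows, key=lambda r: (r["pid"], r["date"]))


def baseline_deviation(history, latest):
    """z-score of the latest value against the patient's own prior values.

    A simple stand-in for "deviation from baseline"; not the study's method.
    """
    return (latest - mean(history)) / stdev(history)


records = prepare(EHR_A, EHR_B, salt="demo-salt")
systolic = [r["systolic"] for r in records]
z = baseline_deviation(systolic[:-1], systolic[-1])
print(f"{len(records)} records; latest systolic deviation z = {z:.1f}")
```

In this toy data the final systolic reading sits far above the patient's own earlier readings, which is the kind of deviation a baseline-relative approach is meant to surface even when the absolute value might not yet trigger a diagnosis.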

Results: Patient records can be made actionable for the goal of effectively using ML, specifically to identify cardiovascular risk in pregnant patients.

Conclusions: Once data have been acquired, concatenated, anonymized, and normalized across multiple EHRs, an ML-based tool can provide early identification of cardiovascular risk in pregnant patients.

Keywords: artificial intelligence; cardiovascular risk; data transformation; electronic health record; electronic medical record; extract; load; machine learning; maternal health; maternal morbidity and mortality; transform.