High resolution data modifies intensive care unit dialysis outcome predictions as compared with low resolution administrative data set

Jennifer Ziegler; Barret N M Rush; Eric R Gottlieb; Leo Anthony Celi; Miguel Ángel Armengol de la Hoz

doi:10.1371/journal.pdig.0000124

PLOS Digit Health. 2022 Oct 11;1(10):e0000124. doi: 10.1371/journal.pdig.0000124. eCollection 2022 Oct.

Authors

Jennifer Ziegler¹, Barret N M Rush¹, Eric R Gottlieb²,³,⁴, Leo Anthony Celi³,⁴,⁵,⁶, Miguel Ángel Armengol de la Hoz⁴,⁷,⁸

Affiliations

¹ Department of Internal Medicine, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada.
² Department of Medicine, Mount Auburn Hospital, Cambridge, Massachusetts, United States of America.
³ Harvard Medical School, Boston, Massachusetts, United States of America.
⁴ Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America.
⁵ Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America.
⁶ Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America.
⁷ Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America.
⁸ Big Data Department, Fundacion Progreso y Salud, Regional Ministry of Health of Andalucia.

Abstract

High-resolution clinical databases from electronic health records are increasingly being used in the field of health data science. Compared with traditional administrative databases and disease registries, these newer, highly granular clinical datasets offer several advantages, including the availability of detailed clinical information for machine learning and the ability to adjust for potential confounders in statistical models. The purpose of this study is to compare the analysis of the same clinical research question using an administrative database and an electronic health record database. The Nationwide Inpatient Sample (NIS) was used for the low-resolution model, and the eICU Collaborative Research Database (eICU) was used for the high-resolution model. A parallel cohort of patients admitted to the intensive care unit (ICU) with sepsis and requiring mechanical ventilation was extracted from each database. The primary outcome was mortality and the exposure of interest was the use of dialysis. In the low-resolution model, after controlling for the covariates that are available, dialysis use was associated with increased mortality (eICU: OR 2.07, 95% CI 1.75-2.44, p<0.01; NIS: OR 1.40, 95% CI 1.36-1.45, p<0.01). In the high-resolution model, after the addition of the clinical covariates, the apparent harmful effect of dialysis on mortality was no longer significant (OR 1.04, 95% CI 0.85-1.28, p = 0.64). The results of this experiment show that the addition of high-resolution clinical variables to statistical models significantly improves the ability to control for important confounders that are not available in administrative datasets. This suggests that the results from prior studies using low-resolution data may be inaccurate and may need to be repeated using detailed clinical data.
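The mechanism described above, in which an apparent effect disappears once a clinical confounder is accounted for, can be illustrated with a small sketch. It rests on assumptions that are not taken from the study: the counts are invented toy numbers, illness severity is reduced to a hypothetical two-level variable, and Mantel-Haenszel stratification is used in place of the multivariable models the abstract refers to. The names `Table`, `odds_ratio`, `collapse` and `mantel_haenszel_or` are illustrative only.

```python
from typing import Iterable, NamedTuple


class Table(NamedTuple):
    """2x2 table of dialysis (exposure) by death (outcome) for one stratum.

    All counts used below are made-up illustrative numbers, NOT study data.
    """
    exposed_died: int
    exposed_survived: int
    unexposed_died: int
    unexposed_survived: int


def odds_ratio(t: Table) -> float:
    """Odds ratio of death for dialysis versus no dialysis in one table."""
    return (t.exposed_died * t.unexposed_survived) / (
        t.exposed_survived * t.unexposed_died
    )


def collapse(tables: Iterable[Table]) -> Table:
    """Ignore the stratifying variable by summing the tables cell by cell."""
    return Table(*(sum(cell) for cell in zip(*tables)))


def mantel_haenszel_or(tables: Iterable[Table]) -> float:
    """Mantel-Haenszel pooled odds ratio across strata."""
    num = den = 0.0
    for t in tables:
        n = sum(t)  # total patients in this stratum
        num += t.exposed_died * t.unexposed_survived / n
        den += t.exposed_survived * t.unexposed_died / n
    return num / den


# Hypothetical strata: dialysis is more common, and mortality higher,
# among sicker patients, but within each stratum dialysis changes nothing.
strata = [
    Table(10, 90, 90, 810),     # low severity: 10% mortality in both groups
    Table(200, 200, 300, 300),  # high severity: 50% mortality in both groups
]

crude_or = odds_ratio(collapse(strata))   # severity unavailable
adjusted_or = mantel_haenszel_or(strata)  # severity available

if __name__ == "__main__":
    print(f"crude OR (severity ignored):      {crude_or:.2f}")
    print(f"Mantel-Haenszel OR (by severity): {adjusted_or:.2f}")
```

With these toy counts the crude odds ratio is about 2.06, while the odds ratio within each severity stratum and the pooled Mantel-Haenszel estimate are both 1.00. The point is only qualitative: a dataset that cannot record severity can show an association that a dataset with that variable does not.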

Copyright: © 2022 Ziegler et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Grants and funding

Research reported in this publication was supported by the National Institutes of Health grants T32DK007527 (ERG) and NIBIB R01EB017205 (LAC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.