Development and validation of the predictive risk of death model for adult patients admitted to intensive care units in Japan: an approach to improve the accuracy of healthcare quality measures

J Intensive Care. 2021 Feb 15;9(1):18. doi: 10.1186/s40560-021-00533-z.

Abstract

Background: The Acute Physiology and Chronic Health Evaluation (APACHE) III-j model is widely used to predict mortality in Japanese intensive care units (ICUs). Although the model's discrimination is excellent, its calibration is poor. APACHE III-j overestimates the risk of death, making its evaluation of healthcare quality inaccurate. This study aimed to improve the calibration of the model and develop a Japan Risk of Death (JROD) model for benchmarking purposes.

Methods: A retrospective analysis was conducted using a national clinical registry of ICU patients in Japan. Adult patients admitted to an ICU between April 1, 2018, and March 31, 2019, were included. The APACHE III-j model was recalibrated in three ways: model 1, a generalized linear model predicting mortality with the linear predictor of the APACHE III-j model entered as an offset variable; model 2, a generalized linear model predicting mortality with the linear predictor of the APACHE III-j model; and model 3, a hierarchical generalized additive model predicting mortality with the linear predictor of the APACHE III-j model. Model performance was assessed with the area under the receiver operating characteristic curve (AUROC), the Brier score, and the modified Hosmer-Lemeshow test. To confirm that the models were applicable to evaluating quality of care, funnel plots of the standardized mortality ratio and exponentially weighted moving average (EWMA) charts for mortality were drawn.
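The first two recalibration strategies can be illustrated with a minimal sketch. It assumes that the linear predictor is the logit of the original predicted risk, that model 1 re-estimates only the intercept (slope fixed at 1 through the offset), and that model 2 re-estimates both intercept and slope. The function names, the Newton-Raphson fitting routine, and the toy data are hypothetical and are not taken from the study; model 3 (the hierarchical generalized additive model) is not sketched here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def recalibrate(lp, died, fit_slope, n_iter=50):
    """Fit logit(p) = a + b * lp by Newton-Raphson (hypothetical helper).

    fit_slope=False keeps b = 1, so lp acts as an offset (model 1 style).
    fit_slope=True estimates both a and b (model 2 style).
    """
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        p = [sigmoid(a + b * x) for x in lp]
        w = [pi * (1.0 - pi) for pi in p]
        # Score (gradient) and information (negative Hessian) terms
        g_a = sum(y - pi for y, pi in zip(died, p))
        h_aa = sum(w)
        if not fit_slope:
            a += g_a / h_aa
            continue
        g_b = sum((y - pi) * x for y, pi, x in zip(died, p, lp))
        h_ab = sum(wi * x for wi, x in zip(w, lp))
        h_bb = sum(wi * x * x for wi, x in zip(w, lp))
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Toy data (invented): original risks that overestimate mortality overall.
orig_risk = [0.02, 0.05, 0.05, 0.10, 0.10, 0.20, 0.20, 0.30,
             0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.05, 0.30]
died = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
lp = [logit(p) for p in orig_risk]

a1, b1 = recalibrate(lp, died, fit_slope=False)  # intercept only
a2, b2 = recalibrate(lp, died, fit_slope=True)   # intercept and slope

risk_m1 = [sigmoid(a1 + b1 * x) for x in lp]
risk_m2 = [sigmoid(a2 + b2 * x) for x in lp]

print("observed deaths:", sum(died))
print("expected deaths, original:", round(sum(orig_risk), 2))
print("expected deaths, model 1 style:", round(sum(risk_m1), 2))
print("expected deaths, model 2 style:", round(sum(risk_m2), 2))
```

In this sketch both refits force the expected number of deaths to match the observed number, which is the sense in which they correct overall overestimation. Neither changes the ranking of patients when the fitted slope is positive, so discrimination is expected to stay essentially unchanged.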

Results: In total, 33,557 patients from 44 ICUs were included in the study population. ICU mortality was 3.8%, and hospital mortality was 8.1%. The AUROC, Brier score, and modified Hosmer-Lemeshow p value were 0.915, 0.062, and < .001 for the original model; 0.915, 0.047, and < .001 for model 1; 0.915, 0.047, and .002 for model 2; and 0.917, 0.047, and .84 for model 3. The funnel plots showed overdispersion for every model except model 3. The validity of the EWMA charts for the recalibrated models was determined by visual inspection.
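For readers unfamiliar with the reported measures, the following minimal sketch shows how the AUROC, the Brier score, the standardized mortality ratio (observed divided by expected deaths), and an EWMA series can be computed from predicted risks and outcomes. The function names, the smoothing weight of 0.5, the starting value, and the four-patient toy data are assumptions for illustration only; the study's EWMA chart settings and control limits are not given in this abstract and are not reproduced here.

```python
def brier_score(risk, died):
    """Mean squared difference between predicted risk and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(risk, died)) / len(died)

def auroc(risk, died):
    """Probability that a non-survivor has a higher predicted risk than a
    survivor, counting ties as one half (pairwise definition)."""
    pos = [p for p, y in zip(risk, died) if y == 1]
    neg = [p for p, y in zip(risk, died) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def smr(risk, died):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return sum(died) / sum(risk)

def ewma(values, lam, start):
    """Exponentially weighted moving average with smoothing weight lam."""
    out, current = [], start
    for v in values:
        current = lam * v + (1.0 - lam) * current
        out.append(current)
    return out

# Toy data (invented), ordered by admission time.
risk = [0.1, 0.4, 0.3, 0.8]
died = [0, 0, 1, 1]

print("Brier score:", brier_score(risk, died))
print("AUROC:", auroc(risk, died))
print("SMR:", smr(risk, died))
print("EWMA of observed deaths:", ewma(died, lam=0.5, start=0.0))
print("EWMA of predicted risk:", ewma(risk, lam=0.5, start=0.0))
```

An SMR above 1 means more deaths were observed than predicted. In an EWMA chart the smoothed observed mortality is typically compared over time with the smoothed predicted risk.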

Conclusions: Model 3 showed good performance and can be adopted as the JROD model for monitoring quality of care in an ICU, although further investigation of the clinical validity of outlier detection is required. This update method may also be useful in other settings.

Keywords: Benchmarking; Quality improvement; Quality indicator; Recalibration; Risk of death; Risk prediction model.