Real-time imputation of missing predictor values in clinical practice

Steven W J Nijman; Jeroen Hoogland; T Katrien J Groenhof; Menno Brandjes; John J L Jacobs; Michiel L Bots; Folkert W Asselbergs; Karel G M Moons; Thomas P A Debray

doi:10.1093/ehjdh/ztaa016

Eur Heart J Digit Health. 2020 Dec 19;2(1):154-164. doi: 10.1093/ehjdh/ztaa016. eCollection 2021 Mar.

Affiliations

¹ Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands.
² Department of Health, Ortec B.V., Zoetermeer, Houtsingel 5, 2719 EA Zoetermeer, The Netherlands.
³ Department of Cardiology, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands.
⁴ Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, 62 Huntley St, Fitzrovia, London WC1E 6DD, UK.
⁵ Health Data Research UK, Institute of Health Informatics, University College London, Gibbs Building, 215 Euston Rd, London NW1 2BE, UK.

Abstract

Aims: Use of prediction models is widely recommended by clinical guidelines, but usually requires complete information on all predictors, which is not always available in daily practice. We aim to describe two methods for real-time handling of missing predictor values when using prediction models in practice.

Methods and results: We compare the widely used method of mean imputation (M-imp) to a method that personalizes the imputations by taking advantage of the observed patient characteristics. These characteristics may include both prediction model variables and other characteristics (auxiliary variables). The personalized method was implemented using imputation from a joint multivariate normal model of the patient characteristics (joint modelling imputation; JMI). Data from two different cardiovascular cohorts with cardiovascular predictors and outcome were used to evaluate the real-time imputation methods. We quantified the prediction model's overall performance [mean squared error (MSE) of linear predictor], discrimination (c-index), calibration (intercept and slope), and net benefit (decision curve analysis). When compared with mean imputation, JMI substantially improved the MSE (0.10 vs. 0.13; JMI vs. M-imp throughout), c-index (0.70 vs. 0.68), and calibration (calibration-in-the-large: 0.04 vs. 0.06; calibration slope: 1.01 vs. 0.92), especially when incorporating auxiliary variables. When the imputation method was based on an external cohort, calibration deteriorated, but discrimination remained similar.
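To illustrate the difference between the two approaches, the following is a minimal sketch, not the authors' implementation. It assumes that a mean vector and covariance matrix of the patient characteristics have already been estimated from a development cohort, and it imputes each missing value by its conditional expectation under a multivariate normal model, mu_m + Sigma_mo Sigma_oo^{-1} (x_o - mu_o). The function names, the three variables (age, systolic blood pressure, cholesterol), and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_impute(x, mu):
    """Mean imputation: replace each missing entry (NaN) by its population mean."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.where(np.isnan(x), mu, x)


def joint_normal_impute(x, mu, sigma):
    """Impute missing entries by their conditional mean given the observed
    entries, under an assumed multivariate normal model N(mu, sigma)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    miss = np.isnan(x)
    if not miss.any():
        return x.copy()      # nothing to impute
    if miss.all():
        return mu.copy()     # no observed information: falls back to the means
    obs = ~miss

    sigma_oo = sigma[np.ix_(obs, obs)]    # covariance among observed variables
    sigma_mo = sigma[np.ix_(miss, obs)]   # covariance of missing with observed

    # Solve sigma_oo @ w = (x_o - mu_o) rather than inverting sigma_oo.
    w = np.linalg.solve(sigma_oo, x[obs] - mu[obs])

    out = x.copy()
    out[miss] = mu[miss] + sigma_mo @ w
    return out


# Hypothetical population parameters: age (years), systolic BP (mmHg),
# total cholesterol (mmol/L). These are made-up values, not study estimates.
mu = np.array([60.0, 140.0, 5.0])
sigma = np.array([
    [100.0,  60.0, 2.0],
    [ 60.0, 400.0, 4.0],
    [  2.0,   4.0, 1.0],
])

# A 75-year-old patient with average cholesterol and missing systolic BP.
patient = np.array([75.0, np.nan, 5.0])

m_imp = mean_impute(patient, mu)
jmi = joint_normal_impute(patient, mu, sigma)
print("Mean imputation:      ", m_imp)
print("Conditional-mean JMI: ", jmi)
```

In this toy example, mean imputation assigns the population mean blood pressure regardless of the patient's age, whereas the joint-model sketch shifts the imputed value upward because the patient is older than average and age is positively correlated with blood pressure in the assumed covariance matrix. Auxiliary variables would enter simply as additional rows and columns of `mu` and `sigma`.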

Conclusions: We recommend JMI with auxiliary variables for real-time imputation of missing values, and to update imputation models when implementing them in new settings or (sub)populations.

Keywords: Computerized decision support system; Electronic health records; Joint modelling imputation; Missing data; Prediction; Real-time imputation.