A nonparametric updating method to correct clinical prediction model drift

J Am Med Inform Assoc. 2019 Dec 1;26(12):1448-1457. doi: 10.1093/jamia/ocz127.

Abstract

Objective: Clinical prediction models require updating as performance deteriorates over time. We developed a testing procedure for selecting an updating method; the procedure minimizes overfitting, incorporates the uncertainty associated with updating sample sizes, and applies to both parametric and nonparametric models.

Materials and methods: We describe a procedure to select an updating method for dichotomous outcome models by balancing simplicity against accuracy. We illustrate the test's properties on simulated scenarios of population shift and 2 models based on Department of Veterans Affairs inpatient admissions.

Results: In simulations, the test generally recommended no update under no population shift, no update or modest recalibration under case mix shifts, intercept correction under changing outcome rates, and refitting under shifted predictor-outcome associations. The recommended updates provided superior or similar calibration to that achieved with more complex updating. In the case study, however, small update sets led the test to recommend simpler updates than may have been ideal based on subsequent performance.
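
For readers unfamiliar with the simpler updating methods named above, the following Python sketch illustrates two of them, intercept correction and logistic recalibration, on simulated data in which the outcome rate has fallen since the model was developed. It is not the authors' testing procedure and does not select among updating methods. The function names (fit_update, apply_update), the Newton-Raphson fitting routine, and the simulation settings are all assumptions made for illustration.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_update(preds, outcomes, fit_slope=False, iters=50):
    """Fit logit(updated) = a + b * logit(original) by Newton-Raphson.

    fit_slope=False fixes b = 1 (intercept correction);
    fit_slope=True estimates both a and b (logistic recalibration).
    Hypothetical helper, not taken from the paper.
    """
    lp = [logit(p) for p in preds]
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [expit(a + b * x) for x in lp]
        w = [m * (1.0 - m) for m in mu]
        r = [y - m for y, m in zip(outcomes, mu)]
        g_a, h_aa = sum(r), sum(w)
        if not fit_slope:
            a += g_a / h_aa  # one-parameter Newton step
            continue
        g_b = sum(ri * x for ri, x in zip(r, lp))
        h_ab = sum(wi * x for wi, x in zip(w, lp))
        h_bb = sum(wi * x * x for wi, x in zip(w, lp))
        det = h_aa * h_bb - h_ab * h_ab
        # Two-parameter Newton step: solve the 2x2 system by hand.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def apply_update(preds, a, b):
    return [expit(a + b * logit(p)) for p in preds]


# Simulated update set (assumed scenario): the original model
# overpredicts because true log-odds are 1 unit lower than predicted.
random.seed(0)
true_lp = [random.gauss(-1.0, 1.0) for _ in range(5000)]
original_preds = [expit(x + 1.0) for x in true_lp]
outcomes = [1 if random.random() < expit(x) else 0 for x in true_lp]

a_int, b_int = fit_update(original_preds, outcomes, fit_slope=False)
a_rec, b_rec = fit_update(original_preds, outcomes, fit_slope=True)
updated = apply_update(original_preds, a_int, b_int)

observed_rate = sum(outcomes) / len(outcomes)
print("observed event rate:       ", round(observed_rate, 3))
print("mean original prediction:  ", round(sum(original_preds) / len(original_preds), 3))
print("mean updated prediction:   ", round(sum(updated) / len(updated), 3))
print("intercept correction a:    ", round(a_int, 3))
print("recalibration (a, b):      ", (round(a_rec, 3), round(b_rec, 3)))
```

In this assumed scenario only the outcome rate changes, so the estimated intercept shift should be close to -1 and the recalibration slope close to 1, which is the kind of setting where the abstract reports that intercept correction was recommended.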

Discussion: Our test's recommendations highlighted the benefits of simple updating as opposed to systematic refitting in response to performance drift. The complexity of recommended updating methods reflected sample size and magnitude of performance drift, as anticipated. The case study highlights the conservative nature of our test.

Conclusions: This new test supports data-driven updating of models developed with both biostatistical and machine learning approaches, promoting the transportability and maintenance of a wide array of clinical prediction models and, in turn, a variety of applications relying on modern prediction tools.

Keywords: calibration; model updating; predictive analytics.

MeSH terms

  • Humans
  • Machine Learning
  • Models, Statistical*
  • Prognosis
  • Risk Assessment / methods*
  • Risk Assessment / statistics & numerical data
  • Statistics, Nonparametric*