Introduction to Clinical Prediction Models

Ann Clin Epidemiol. 2022 Jul 1;4(3):72-80. doi: 10.37737/ace.22010. eCollection 2022.

Abstract

Clinical prediction models include diagnostic prediction models, which estimate the probability that an individual currently has a disease (e.g., pulmonary embolism), and prognostic prediction models, which estimate the probability that an individual will develop a specific health outcome over a specific time period (e.g., myocardial infarction and stroke within 10 years). Clinical prediction models can be developed by applying traditional regression models (e.g., logistic and Cox regression models) or emerging machine learning models to real-world data, such as electronic health records and administrative claims data. For derivation, researchers select candidate variables based on a literature review and clinical knowledge, and then choose the predictor variables for the final model based on pre-defined criteria (e.g., thresholds for the size of relative risk and p-values) or strategies such as stepwise regression and least absolute shrinkage and selection operator (LASSO) regression. For validation, the clinical prediction model's performance is evaluated in terms of goodness of fit (e.g., R²), discrimination (e.g., area under the receiver operating characteristic curve, or c-statistic), and calibration (e.g., calibration plot and Hosmer-Lemeshow test). The contribution of a new variable added to an existing clinical prediction model is evaluated in terms of reclassification (e.g., net reclassification improvement and integrated discrimination improvement). The model should be validated using the original data to examine internal validity, through resampling methods (e.g., cross-validation and bootstrapping), and using other participants' data to examine external validity. For successful implementation of a clinical prediction model in actual clinical practice, presentation formats such as a paper-based nomogram, a web-based calculator, and an easy-to-use risk score should be considered.

Keywords: derivation; machine learning; regression; risk score; validation.
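
The validation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (c_statistic, calibration_table, bootstrap_c_interval) and the toy data are hypothetical, chosen only for illustration. The sketch assumes a binary outcome coded 0/1 and predicted probabilities that have already been produced by some fitted model. Note that the bootstrap shown here only resamples fixed predictions to give a rough percentile interval for the c-statistic; a full internal validation would refit the model in each resample, which is beyond this sketch.

```python
import random


def c_statistic(y_true, y_prob):
    """Discrimination: share of (event, non-event) pairs in which the
    event has the higher predicted risk. Ties count as 0.5."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevents = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(nonevents))


def calibration_table(y_true, y_prob, n_groups=2):
    """Calibration: (mean predicted risk, observed event rate) within
    groups of patients ordered by predicted risk."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            observed = sum(y for _, y in chunk) / len(chunk)
            rows.append((mean_pred, observed))
    return rows


def bootstrap_c_interval(y_true, y_prob, n_boot=200, seed=0):
    """Rough 95% percentile interval for the c-statistic, obtained by
    resampling patients with replacement (predictions are held fixed)."""
    c_statistic(y_true, y_prob)  # raises early if only one outcome class
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples with a single outcome class
            stats.append(c_statistic(ys, [y_prob[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Toy data (hypothetical): observed outcomes and predicted risks.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]

print(c_statistic(outcomes, risks))
print(calibration_table(outcomes, risks))
print(bootstrap_c_interval(outcomes, risks))
```

In a well-calibrated model, the mean predicted risk and the observed event rate in each group of the calibration table are close to each other; plotting these pairs gives a simple calibration plot.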