Internal and External Validation of the Generalizability of Machine Learning Algorithms in Predicting Non-home Discharge Disposition Following Primary Total Knee Joint Arthroplasty

J Arthroplasty. 2023 Oct;38(10):1973-1981. doi: 10.1016/j.arth.2023.01.065. Epub 2023 Feb 9.

Abstract

Background: Non-home discharge disposition following primary total knee arthroplasty (TKA) is associated with a higher rate of complications and places a socioeconomic burden on the health care system. Existing algorithms for predicting non-home discharge disposition vary in mathematical complexity and predictive power, but their capacity to generalize beyond the development dataset remains limited. Therefore, this study aimed to establish the generalizability of machine learning models by performing internal and external validations using national-scale and institutional cohorts, respectively.

Methods: Four machine learning models were trained using the national cohort. Recursive feature elimination and hyper-parameter tuning were applied. Internal validation was achieved through five-fold cross-validation during model training. The trained models' performance was externally validated using the institutional cohort and assessed by discrimination, calibration, and clinical utility.
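As an illustration of the internal validation step, the following is a minimal sketch of how a cohort could be split into five folds for cross-validation. It is not the authors' code: the function name `stratified_kfold`, the synthetic labels, and the choice to stratify by outcome are all assumptions made for this example, since the abstract states only that five-fold cross-validation was used.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs, spreading each class evenly over k folds.

    Hypothetical helper for illustration; stratification is an assumption.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal indices round-robin into folds

    for f in range(k):
        val = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, val

# Synthetic outcome labels (1 = non-home discharge), 19% positive by construction.
labels = [1 if i % 100 < 19 else 0 for i in range(1000)]

splits = list(stratified_kfold(labels, k=5))
for fold, (train, val) in enumerate(splits):
    rate = sum(labels[i] for i in val) / len(val)
    print(f"fold {fold}: n_train={len(train)}, n_val={len(val)}, val rate={rate:.3f}")
```

Each model would be fit on the training indices and scored on the held-out fold, with the five scores averaged. Any feature elimination or hyperparameter tuning is normally repeated inside each training fold so that the held-out fold stays unseen.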

Results: The national (424,354 patients) and institutional (10,196 patients) cohorts had non-home discharge rates of 19.4% and 36.4%, respectively. The areas under the receiver operating characteristic curve of the model predictions were 0.83 to 0.84 during internal validation and increased to 0.88 to 0.89 during external validation. Artificial neural network and histogram-based gradient boosting achieved the best performance, with a mean area under the receiver operating characteristic curve of 0.89, calibration slope of 1.39, and Brier score of 0.14, which indicated that the two models were robust in distinguishing non-home discharge and well-calibrated with accurate predictions of the probabilities. The low similarity between the two datasets supported the reliability of the external validation. Length of stay, age, body mass index, and sex were the strongest predictors of discharge destination after primary TKA.
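To make the three reported metrics concrete, the sketch below computes an area under the receiver operating characteristic curve, a Brier score, and a calibration slope in plain Python. It is an illustrative sketch, not the study's evaluation code: the function names, the toy inputs, and the simulated cohort are assumptions, and the calibration slope is estimated here by a logistic regression of the outcome on the logit of the predicted probability, which is one common definition.

```python
import math
import random

def auc(y_true, y_prob):
    """Probability that a random positive is ranked above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def logit(p, eps=1e-6):
    p = min(max(p, eps), 1 - eps)  # clip to avoid infinite logits
    return math.log(p / (1 - p))

def calibration_slope(y_true, y_prob, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton's method and return b."""
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu
            g1 += (yi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

# Small hand-checkable example.
toy_y = [0, 0, 1, 1]
toy_p = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_y, toy_p)            # 3 of 4 positive/negative pairs ranked correctly
toy_brier = brier_score(toy_y, toy_p)

# Simulated cohort: outcomes drawn from the stated probabilities.
rng = random.Random(0)
true_p = [rng.uniform(0.05, 0.95) for _ in range(2000)]
y = [1 if rng.random() < p else 0 for p in true_p]
slope_true = calibration_slope(y, true_p)

# Same predictions with logits halved, i.e. pulled toward 0.5.
shrunk_p = [1 / (1 + math.exp(-0.5 * logit(p))) for p in true_p]
slope_shrunk = calibration_slope(y, shrunk_p)

print(toy_auc, toy_brier, slope_true, slope_shrunk)
```

Under this definition a slope of 1 is the ideal value. Halving the predicted logits doubles the fitted slope, so a slope above 1 indicates predictions that are too close to the average risk, while a slope below 1 indicates predictions that are too extreme.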

Conclusion: The machine learning models demonstrated excellent predictive performance during both internal and external validations, supporting their generalizability across different patient cohorts and potential applicability in the clinical workflow.

Keywords: artificial intelligence; discharge disposition; external validation; machine learning model; total knee arthroplasty.

MeSH terms

  • Algorithms
  • Arthroplasty, Replacement, Knee*
  • Humans
  • Knee Joint
  • Machine Learning
  • Patient Discharge
  • Retrospective Studies