Can Machine-learning Algorithms Predict Early Revision TKA in the Danish Knee Arthroplasty Registry?

Anders El-Galaly; Clare Grazal; Andreas Kappel; Poul Torben Nielsen; Steen Lund Jensen; Jonathan A Forsberg

doi:10.1097/CORR.0000000000001343

Clin Orthop Relat Res. 2020 Sep;478(9):2088-2101. doi: 10.1097/CORR.0000000000001343.

Authors

Anders El-Galaly¹ ², Clare Grazal³, Andreas Kappel¹ ², Poul Torben Nielsen¹ ², Steen Lund Jensen¹ ², Jonathan A Forsberg⁴

Affiliations

¹ A. El-Galaly, A. Kappel, P. T. Nielsen, S. L. Jensen, Orthopedic Research Unit, Aalborg University Hospital, Aalborg, Denmark.
² A. El-Galaly, A. Kappel, P. T. Nielsen, S. L. Jensen, Department of Clinical Medicine, Aalborg University, Aalborg, Denmark.
³ C. Grazal, J. A. Forsberg, Uniformed Services University-Walter Reed Department of Surgery, Bethesda, MD, USA.
⁴ J. A. Forsberg, Department of Orthopaedic Surgery, Johns Hopkins University, Baltimore MD, USA.

Abstract

Background: Revision TKA is a serious adverse event with substantial consequences for the patient. As the demand for TKA rises, reducing the risk of revision TKA is becoming increasingly important. Predictive tools based on machine-learning algorithms could reform clinical practice. Few attempts have been made to combine machine-learning algorithms with data from nationwide arthroplasty registries and, to the authors' knowledge, none have tried to predict the likelihood of early revision TKA.

Question/purposes: We used the Danish Knee Arthroplasty Registry to build models to predict the likelihood of revision TKA within 2 years of primary TKA and asked: (1) Which preoperative factors were the most important features behind these models' predictions of revision? (2) Can a clinically meaningful model be built on the preoperative factors included in the Danish Knee Arthroplasty Registry?

Methods: The Danish Knee Arthroplasty Registry collects patients' characteristics and surgical information from all arthroplasties conducted in Denmark and thus provides a large nationwide cohort of patients undergoing TKA. As the training dataset, we retrieved all preoperative variables of 25,104 primary TKAs from 2012 to 2015. The same variables were retrieved from 6170 TKAs conducted in 2016, which were used as a hold-out year for temporal external validation. If a patient received bilateral TKA, only the first knee to receive surgery was included. All patients were followed for 2 years, with removal, exchange, or addition of an implant defined as TKA revision. To find the best-performing model, we created four different predictive models: a regression-based model using logistic regression with least absolute shrinkage and selection operator (LASSO), two classification tree models (random forest and gradient boosting model), and a supervised neural network. For comparison, we created a noninformative model predicting that all observations were unrevised. The four machine-learning models were trained using 10-fold cross-validation on the training dataset after adjusting for the low percentage of revisions by oversampling revised observations and undersampling unrevised observations. In the validation dataset, the models' performance was evaluated and compared by density plot, calibration plot, accuracy, Brier score, receiver operating characteristic (ROC) curve, and area under the curve (AUC). The density plot depicts the distribution of predicted probabilities, and the calibration plot graphically depicts whether the predicted probability resembled the observed probability. The accuracy indicates how often the models' predictions were correct, and the Brier score is the mean squared difference between the predicted probability and the observed outcome. The ROC curve is a graphical output of the models' sensitivity and specificity, from which the AUC is calculated. The AUC can be interpreted as the likelihood that a model correctly distinguishes a revised from an unrevised observation; thus, a priori, an AUC of 0.7 was chosen as the threshold for a clinically meaningful model.
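As an illustration of the evaluation metrics named above, the following minimal Python sketch computes accuracy, Brier score, and AUC for a noninformative model and for a made-up set of predictions. It is not the authors' code: the ten outcomes, the `toy_model` probabilities, the 0.5 classification cutoff, and all function names are assumptions chosen for illustration. The Brier score is computed in its usual form as the mean squared difference, and the AUC is computed by pairwise comparison of revised and unrevised observations.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float],
             cutoff: float = 0.5) -> float:
    """Fraction of observations classified correctly at the given cutoff."""
    preds = [1 if p >= cutoff else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Share of (revised, unrevised) pairs in which the revised observation
    has the higher predicted probability; ties count as half."""
    revised = [p for p, y in zip(y_prob, y_true) if y == 1]
    unrevised = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = sum(1.0 if r > u else 0.5 if r == u else 0.0
               for r in revised for u in unrevised)
    return wins / (len(revised) * len(unrevised))


# Hypothetical validation outcomes: 1 = revised within 2 years, 0 = unrevised.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Noninformative model: every observation is predicted to be unrevised.
noninformative = [0.0] * len(y_true)

# Made-up predicted probabilities standing in for a trained model.
toy_model = [0.1, 0.2, 0.1, 0.3, 0.2, 0.6, 0.1, 0.4, 0.7, 0.3]

for name, probs in [("noninformative", noninformative),
                    ("toy model", toy_model)]:
    print(name,
          "accuracy:", round(accuracy(y_true, probs), 3),
          "Brier:", round(brier_score(y_true, probs), 3),
          "AUC:", round(auc(y_true, probs), 3))
```

When revisions are rare, a model that predicts "unrevised" for everyone can still score well on accuracy while its AUC stays at 0.5, which is one reason a discrimination measure such as the AUC is useful alongside accuracy.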

Results: Based on the model training, age, postfracture osteoarthritis, and weight were deemed important preoperative factors within the machine-learning models. During validation, the models' performance was not different from that of the noninformative model, and with AUCs ranging from 0.57 to 0.60, no model reached the predetermined AUC threshold for a clinically useful discriminative capacity.

Conclusion: Although several well-known presurgical risk factors for revision were coupled with four different machine learning methods, we could not develop a clinically useful model capable of predicting early TKA revisions in the Danish Knee Arthroplasty Registry based on preoperative data.

Clinical relevance: The inability to predict early TKA revision highlights that predicting revision based on preoperative information alone is difficult. Future models might benefit from including medical comorbidities and an anonymous surgeon identifier variable, or they may attempt to build a postoperative predictive model that includes intra- and postoperative factors, as these may have a stronger association with early TKA revisions.

Publication types

Evaluation Study

MeSH terms

Adult
Age Factors
Aged
Algorithms*
Arthroplasty, Replacement, Knee / adverse effects
Arthroplasty, Replacement, Knee / statistics & numerical data*
Body Weight
Denmark
Female
Humans
Knee Injuries
Machine Learning*
Male
Middle Aged
Osteoarthritis, Knee / etiology
Osteoarthritis, Knee / surgery
Postoperative Complications / etiology
Postoperative Complications / surgery
Predictive Value of Tests
Preoperative Period
Registries
Reoperation / statistics & numerical data*
Risk Assessment / methods*
Risk Factors