Development and Validation of a Deep-Learning Model to Predict Total Hip Replacement on Radiographs: The Total Hip Replacement Prediction (THREP) Model

J Bone Joint Surg Am. 2024 Mar 6;106(5):389-396. doi: 10.2106/JBJS.23.00549. Epub 2023 Dec 12.

Abstract

Background: There are few methods for accurately assessing the risk of total hip arthroplasty (THA) in patients with osteoarthritis. A novel and reliable method that could play a substantial role in research and clinical routine should be investigated. The purpose of the present study was to develop a deep-learning model that can reliably predict the risk of THA with use of radiographic images and clinical symptom data.

Methods: This retrospective, multicenter, case-control study assessed hip joints on weight-bearing anteroposterior pelvic radiographs obtained from Osteoarthritis Initiative (OAI) participants. Participants who underwent THA were matched to controls according to age, sex, body mass index, and ethnicity. Cases and controls were uniformly split into training, validation, and testing data sets at proportions of 72% (n = 528), 14% (n = 104), and 14% (n = 104), respectively. Images and clinical symptom data were passed through a detection model and a deep convolutional neural network (DCNN) model to predict the probability of THA within 9 years as well as the most likely time period for THA (0 to 2 years, 3 to 5 years, 6 to 9 years). Model performance was assessed with use of the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity in the testing set.
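As an illustration of the four performance measures named above, the following minimal Python sketch computes accuracy, sensitivity, specificity, and AUC from binary labels and predicted probabilities. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy data are assumptions made for this example, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold.

    The 0.5 default is an assumption; the abstract does not state a cutoff.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true-positive rate among cases
        "specificity": tn / (tn + fp),   # true-negative rate among controls
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases for k in controls
    )
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical, not from the study): 1 = THA case, 0 = control.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
metrics = binary_metrics(labels, scores)
auc = roc_auc(labels, scores)
```

Unlike the other three measures, the AUC does not depend on the chosen threshold, which is why it is reported separately from accuracy, sensitivity, and specificity.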

Results: A total of 736 participants were evaluated, including 184 cases and 552 controls. The prediction model achieved an overall accuracy, sensitivity, and specificity of 91.35%, 92.59% and 86.96%, respectively, with an AUC of 0.944, for THA within 9 years. The AUC of the DCNN model for assessing the most likely time period was 0.907 for 0 to 2 years, 0.916 for 3 to 5 years, and 0.841 for 6 to 9 years. Gradient-weighted class activation mapping closely corresponded to regions affecting the prediction of the DCNN model.

Conclusions: The proposed DCNN model is a reliable and valid method to predict the probability of THA, within limitations. It could assist clinicians in patient counseling and decision-making regarding the timing of the intervention. In the future, increasing the size of the data set, enhancing the ethnic and socioeconomic diversity of the participants, and improving the follow-up rate could further improve the quality of the conclusions.

Level of evidence: Prognostic Level III. See Instructions for Authors for a complete description of levels of evidence.

Publication types

  • Multicenter Study

MeSH terms

  • Arthroplasty, Replacement, Hip*
  • Case-Control Studies
  • Deep Learning*
  • Humans
  • Osteoarthritis*
  • Retrospective Studies