Assessment of a novel deep learning-based software developed for automatic feature extraction and grading of radiographic knee osteoarthritis

BMC Musculoskelet Disord. 2023 Nov 8;24(1):869. doi: 10.1186/s12891-023-06951-4.

Abstract

Background: The Kellgren-Lawrence (KL) grading system is the most widely used method to classify the severity of osteoarthritis (OA) of the knee. However, because its terminology is ambiguous, the KL system has shown limited inter- and intra-observer reliability. For a more reliable evaluation, we recently developed novel deep learning (DL) software, MediAI-OA, that extracts each radiographic feature of knee OA and grades OA severity according to the KL system.

Methods: This study used data from the Osteoarthritis Initiative to train and validate MediAI-OA: 44,193 radiographs served as training data and 810 radiographs as validation data. The model was developed to automatically quantify the degree of joint space narrowing (JSN) of the medial and lateral tibiofemoral joint, to automatically detect osteophytes in four regions of the knee joint (medial distal femur, lateral distal femur, medial proximal tibia, and lateral proximal tibia), to classify the KL grade, and to present the results for these three OA features together. The model was tested on 400 test datasets, and the results were compared with the ground truth. The accuracy of JSN quantification and osteophyte detection was evaluated. KL grade classification performance was evaluated by precision, recall, F1 score, accuracy, and Cohen's kappa coefficient. In addition, we defined KL grade 2 or higher as clinically significant OA and calculated the accuracy of OA diagnosis under that definition.
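
The evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names and the `threshold` parameter are hypothetical, and it assumes KL grades are integers 0 to 4, that Cohen's kappa is unweighted, and that precision, recall, and F1 are computed per grade. The abstract does not state any of these choices.

```python
from collections import Counter

KL_GRADES = [0, 1, 2, 3, 4]  # assumed integer encoding of KL grades


def accuracy(y_true, y_pred):
    """Fraction of cases where the prediction equals the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_grade_scores(y_true, y_pred, grades=KL_GRADES):
    """Return {grade: (precision, recall, f1)} using one-vs-rest counts."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for g in grades:
        tp = sum(t == g and p == g for t, p in pairs)
        fp = sum(t != g and p == g for t, p in pairs)
        fn = sum(t == g and p != g for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[g] = (precision, recall, f1)
    return scores


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance from the two marginal distributions.
    p_e = sum(true_counts[g] * pred_counts[g] for g in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def oa_diagnosis_accuracy(y_true, y_pred, threshold=2):
    """Accuracy after binarizing: KL grade >= threshold counts as OA."""
    return accuracy(
        [t >= threshold for t in y_true],
        [p >= threshold for p in y_pred],
    )


if __name__ == "__main__":
    # Toy labels for illustration only; these are not study data.
    truth = [0, 1, 2, 3, 4, 2, 1, 0]
    model = [0, 1, 2, 3, 4, 1, 2, 0]
    print("accuracy:", accuracy(truth, model))
    print("kappa:", cohen_kappa(truth, model))
    print("OA diagnosis accuracy:", oa_diagnosis_accuracy(truth, model))
    print("grade 2 (P, R, F1):", per_grade_scores(truth, model)[2])
```

Binarizing at KL grade 2 means that confusions between adjacent grades on the same side of the threshold (for example, grade 3 versus grade 4) do not count as diagnostic errors, so the diagnosis accuracy can exceed the five-class grading accuracy.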

Results: The mean squared error of JSN rate quantification was 0.067, and the average osteophyte detection accuracy of MediAI-OA was 0.84. The accuracy of KL grading was 0.83, and the kappa coefficient between the AI model and the ground truth was 0.768, indicating substantial agreement. The OA diagnosis accuracy of the software was 0.92.

Conclusions: The novel DL software MediAI-OA demonstrated satisfactory performance, comparable to that of experienced orthopedic surgeons and radiologists, in analyzing features of knee OA, KL grading, and OA diagnosis. Therefore, MediAI-OA can enable reliable KL grading and reduce the burden on radiologists.

Keywords: Artificial intelligence; Deep learning; Joint space narrowing; KL grade; Kellgren & Lawrence classification; Knee osteoarthritis.

MeSH terms

  • Deep Learning*
  • Humans
  • Osteoarthritis, Knee* / diagnostic imaging
  • Osteophyte*
  • Reproducibility of Results
  • Software