Trustworthy deep learning framework for the detection of abnormalities in X-ray shoulder images

PLoS One. 2024 Mar 11;19(3):e0299545. doi: 10.1371/journal.pone.0299545. eCollection 2024.

Abstract

Musculoskeletal conditions affect an estimated 1.7 billion people worldwide, causing intense pain and disability. These conditions lead to 30 million emergency room visits yearly, and the numbers are only increasing. However, diagnosing musculoskeletal issues can be challenging, especially in emergencies where quick decisions are necessary. Deep learning (DL) has shown promise in various medical applications. However, previous methods had poor performance and a lack of transparency in detecting shoulder abnormalities on X-ray images due to a lack of training data and better representation of features. This often resulted in overfitting, poor generalisation, and potential bias in decision-making. To address these issues, a new trustworthy DL framework has been proposed to detect shoulder abnormalities (such as fractures, deformities, and arthritis) using X-ray images. The framework consists of two parts: same-domain transfer learning (TL) to mitigate imageNet mismatch and feature fusion to reduce error rates and improve trust in the final result. Same-domain TL involves training pre-trained models on a large number of labelled X-ray images from various body parts and fine-tuning them on the target dataset of shoulder X-ray images. Feature fusion combines the extracted features with seven DL models to train several ML classifiers. The proposed framework achieved an excellent accuracy rate of 99.2%, F1Score of 99.2%, and Cohen's kappa of 98.5%. Furthermore, the accuracy of the results was validated using three visualisation tools, including gradient-based class activation heat map (Grad CAM), activation visualisation, and locally interpretable model-independent explanations (LIME). The proposed framework outperformed previous DL methods and three orthopaedic surgeons invited to classify the test set, who obtained an average accuracy of 79.1%. The proposed framework has proven effective and robust, improving generalisation and increasing trust in the final results.

MeSH terms

  • Arthritis*
  • Deep Learning*
  • Emergency Service, Hospital
  • Humans
  • Musculoskeletal Diseases*
  • Shoulder / diagnostic imaging
  • X-Rays

Grants and funding

The authors would like to acknowledge the support received through the following funding schemes of Australian Government: Australian Research Council (ARC) Industrial Transformation Training Centre (ITTC) for Joint Biomechanics under grant (IC190100020). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.