Algorithmic transparency and interpretability measures improve radiologists' performance in BI-RADS 4 classification

Eur Radiol. 2023 Mar;33(3):1844-1851. doi: 10.1007/s00330-022-09165-9. Epub 2022 Oct 25.

Abstract

Objective: To evaluate the perception of different types of AI-based assistance and the interaction of radiologists with the algorithm's predictions and certainty measures.

Methods: In this retrospective observer study, four radiologists were asked to classify Breast Imaging-Reporting and Data System 4 (BI-RADS 4) lesions (n = 101 benign, n = 99 malignant). The effect of different types of AI-based assistance (occlusion-based interpretability map, classification, and certainty) on the radiologists' performance (sensitivity, specificity, questionnaire) was measured. The influence of the Big Five personality traits was analyzed using the Pearson correlation.
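As an illustration of the two performance measures named above, the following minimal sketch computes sensitivity and specificity from paired ground-truth labels and reader calls. The function name and the toy data are hypothetical and are not taken from the study; only the definitions of the two measures are standard.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], calls: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary reader calls.

    truth[i] is True if lesion i is malignant (reference standard);
    calls[i] is True if the reader classified lesion i as malignant.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")

    tp = sum(1 for t, c in zip(truth, calls) if t and c)          # malignant, called malignant
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)      # malignant, called benign
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)  # benign, called benign
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)      # benign, called malignant

    sensitivity = tp / (tp + fn)  # share of malignant lesions detected
    specificity = tn / (tn + fp)  # share of benign lesions correctly cleared
    return sensitivity, specificity


# Hypothetical toy data: 4 malignant and 4 benign lesions, not study data.
toy_truth = [True, True, True, True, False, False, False, False]
toy_calls = [True, True, True, False, False, False, True, False]
toy_sens, toy_spec = sensitivity_specificity(toy_truth, toy_calls)
```

In an observer study of this kind, the same function would presumably be applied once per reader and per assistance condition, so that the change in each measure can be compared.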

Results: Diagnostic accuracy was significantly improved by AI-based assistance (an increase of 2.8% ± 2.3%, 95% CI 1.5 to 4.0%, p = 0.045), and trust in the algorithm was generated primarily by the certainty of the prediction (100% of participants). Different human-AI interactions were observed, ranging from nearly no interaction to humanization of the algorithm. High scores in neuroticism were correlated with higher persuasibility (Pearson's r = 0.98, p = 0.02), while higher conscientiousness and change of accuracy showed an inverse correlation (Pearson's r = -0.96, p = 0.04).
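The reported p-values can be related to the correlation coefficients with a short calculation. The sketch below assumes that each correlation was computed over n = 4 paired observations (one per radiologist), which the abstract implies but does not state, and that the usual two-sided t-test for Pearson's r was used. The function name is hypothetical. With n = 4 the test has 2 degrees of freedom, for which the Student t distribution has a closed-form CDF, so no statistics library is needed.

```python
import math


def pearson_p_value_n4(r: float) -> float:
    """Two-sided p-value for Pearson's r, assuming n = 4 pairs (df = 2).

    Uses t = r * sqrt(df) / sqrt(1 - r^2) and the closed-form CDF of
    Student's t with 2 degrees of freedom:
        F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = 2  # assumption: n = 4 radiologists, so df = n - 2 = 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    # Two-sided p = 2 * (1 - F(|t|)) = 1 - |t| / sqrt(t^2 + 2)
    return 1.0 - abs(t) / math.sqrt(t * t + df)


p_neuroticism = pearson_p_value_n4(0.98)
p_conscientiousness = pearson_p_value_n4(-0.96)
```

Under these assumptions the expression simplifies algebraically to p = 1 - |r|, which gives 0.02 for r = 0.98 and 0.04 for r = -0.96, matching the values above. It also shows how little data underlies these correlations: with four readers, only coefficients above roughly 0.95 in magnitude reach p < 0.05.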

Conclusion: Trust in the algorithm's performance was mostly dependent on the certainty of the predictions in combination with a plausible heatmap. Human-AI interaction varied widely and was influenced by personality traits.

Key points

  • AI-based assistance significantly improved the diagnostic accuracy of radiologists in classifying BI-RADS 4 mammography lesions.
  • Trust in the algorithm's performance was mostly dependent on the certainty of the prediction in combination with a reasonable heatmap.
  • Personality traits seem to influence human-AI collaboration. Radiologists with specific personality traits were more likely to change their classification according to the algorithm's prediction than others.

Keywords: Algorithms; Artificial intelligence; Perception; Radiologists; Trust.

MeSH terms

  • Algorithms
  • Breast Neoplasms* / diagnostic imaging
  • Female
  • Humans
  • Mammography
  • Radiologists
  • Retrospective Studies
  • Vascular Diseases*