Interobserver variability in quality assessment of magnetic resonance images

BMC Med Imaging. 2020 Sep 22;20(1):109. doi: 10.1186/s12880-020-00505-z.

Abstract

Background: The perceptual quality of magnetic resonance (MR) images influences diagnosis and may compromise treatment. The purpose of this study was to evaluate how changes in image quality influence the interobserver variability of its assessment.

Methods: For the variability evaluation, a dataset containing distorted MR images was prepared and then assessed by 31 experienced medical professionals (radiologists). Differences between observers were analyzed using Fleiss' kappa. However, since the kappa evaluates the agreement among radiologists taking into account aggregated decisions, a typically employed criterion of image quality assessment (IQA) performance was used to provide a more thorough analysis. The IQA performance of each radiologist was evaluated using the Spearman correlation coefficient, ρ, between the individual's scores and the mean opinion scores (MOS) composed of the subjective opinions of the remaining professionals.
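
The leave-one-out comparison described in the Methods can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names (average_ranks, spearman, leave_one_out_rho) and the score layout (one list of scores per observer) are assumptions made for this example, and the toy scores are synthetic rather than taken from the study.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of tie-averaged ranks.
    Raises ZeroDivisionError if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def leave_one_out_rho(scores):
    """scores[r][i] is the score observer r gave to image i (assumed layout).
    Returns one rho per observer, measured against the MOS of all the others."""
    n_obs = len(scores)
    n_img = len(scores[0])
    rhos = []
    for r in range(n_obs):
        # MOS for each image, excluding the observer being evaluated.
        mos = [mean(scores[o][i] for o in range(n_obs) if o != r)
               for i in range(n_img)]
        rhos.append(spearman(scores[r], mos))
    return rhos

# Synthetic toy scores (3 observers, 5 images); not data from the study.
toy = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 5, 4],
    [2, 2, 3, 4, 5],
]
rhos = leave_one_out_rho(toy)
```

Averaging the returned values over observers gives a summary comparable in kind to the reported ρmean. Where SciPy is available, scipy.stats.spearmanr could replace the hand-written spearman helper.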

Results: The experiments show statistically significant agreement among radiologists (κ=0.12; 95% confidence interval [CI]: 0.118, 0.121; P<0.001) on the quality of the assessed images. The resulting κ is strongly affected by the subjectivity of the assigned scores, because close scores are counted as separate categories. Therefore, ρ was used to identify poor performance cases and to confirm the consistency of the majority of the collected scores (ρmean = 0.5706). The results for interns (ρmean = 0.6868) support the finding that the quality assessment of MR images can be successfully taught.
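
The sensitivity of κ to close scores can be illustrated with a short sketch. Fleiss' kappa treats score levels as unordered categories, so two observers who rate an image 3 and 4 count as disagreeing just as much as observers who rate it 1 and 5. The function name fleiss_kappa, the count-table layout, and the four-rater, five-level toy table below are assumptions for this example; the numbers are synthetic and do not come from the study.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Share of all assignments that fall into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    # Per-item agreement: fraction of rater pairs that chose the same category.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_obs = sum(p_item) / n_items       # mean observed agreement
    p_exp = sum(p * p for p in p_cat)   # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)

# Synthetic table: 4 images, 4 raters, 5 score levels.
# On every image the raters split evenly between two adjacent scores,
# so no two raters ever differ by more than one point.
adjacent = [
    [2, 2, 0, 0, 0],
    [0, 2, 2, 0, 0],
    [0, 0, 2, 2, 0],
    [0, 0, 0, 2, 2],
]
kappa_adjacent = fleiss_kappa(adjacent)   # 11/75, about 0.147

# Same images with all raters giving the identical score.
unanimous = [
    [4, 0, 0, 0, 0],
    [0, 4, 0, 0, 0],
    [0, 0, 4, 0, 0],
    [0, 0, 0, 4, 0],
]
kappa_unanimous = fleiss_kappa(unanimous)  # 1.0
```

In the first table the raters order the images consistently, yet κ stays low because each one-point difference is counted as a full disagreement. A rank-based measure such as ρ is not penalized in this way.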

Conclusions: The agreement observed among radiologists from different imaging centers confirms the subjectivity of the perception of MR images. It was shown that the image content and severity of distortions affect the IQA. Furthermore, the study highlights the importance of the psychosomatic condition of the observers and their attitude.

Keywords: Decision process; Fleiss’ kappa; Quality perception; Radiologists.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Diagnostic Imaging / standards*
  • Female
  • Form Perception
  • Humans
  • Magnetic Resonance Imaging / standards*
  • Male
  • Middle Aged
  • Observer Variation
  • Radiographic Image Enhancement