Inter-reader Variability in the Use of BI-RADS Descriptors for Suspicious Findings on Diagnostic Mammography: A Multi-institution Study of 10 Academic Radiologists

Acad Radiol. 2017 Jan;24(1):60-66. doi: 10.1016/j.acra.2016.09.010. Epub 2016 Oct 25.

Abstract

Rationale and objectives: The study aimed to determine the inter-observer agreement among academic breast radiologists when using the Breast Imaging Reporting and Data System (BI-RADS) lesion descriptors for suspicious findings on diagnostic mammography.

Materials and methods: Ten experienced academic breast radiologists across five medical centers independently reviewed 250 de-identified diagnostic mammographic cases that had previously been assessed as BI-RADS 4 or 5 and had a subsequent pathologic diagnosis by percutaneous or surgical biopsy. Each radiologist assessed the presence of the following suspicious mammographic findings: mass, asymmetry (one view), focal asymmetry (two views), architectural distortion, and calcifications. For any identified calcifications, the radiologist also described the morphology and distribution. Inter-observer agreement was determined with the Fleiss kappa statistic. Agreement was also calculated by readers' years of experience.
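As an illustration of the agreement statistic named above, the following is a minimal sketch of the standard Fleiss kappa computation for a fixed number of readers per case. It is not the study's analysis code. The function name `fleiss_kappa`, the count-table layout, and the toy data are assumptions made for this example.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss kappa for a table where counts[i][j] is the number of
    readers who assigned case i to category j.

    Assumes every case was rated by the same number of readers (>= 2).
    """
    n_cases = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two readers per case are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must have the same number of ratings")

    # Share of all ratings that fell into each category.
    total_ratings = n_cases * n_raters
    p_j = [sum(row[j] for row in counts) / total_ratings
           for j in range(n_categories)]

    # Per-case agreement: fraction of reader pairs that agree on that case.
    pairs = n_raters * (n_raters - 1)
    p_i = [(sum(c * c for c in row) - n_raters) / pairs for row in counts]

    observed = sum(p_i) / n_cases          # mean observed agreement
    expected = sum(p * p for p in p_j)     # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (observed - expected) / (1.0 - expected)


# Toy data (made up, not from the study): 4 cases, 10 readers,
# categories = [finding present, finding absent].
toy_counts = [
    [10, 0],
    [8, 2],
    [3, 7],
    [0, 10],
]
toy_kappa = fleiss_kappa(toy_counts)
```

In this layout, each of the 250 cases would be one row and each descriptor (for example, mass present versus absent) would be scored as its own table across the 10 readers.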

Results: Of the 250 lesions, 156 (62%) were benign and 94 (38%) were malignant. Agreement among the 10 readers was strongest for recognizing the presence of calcifications (k = 0.82). There was substantial agreement among the readers for the identification of a mass (k = 0.67), whereas agreement was fair for the presence of a focal asymmetry (k = 0.21) or architectural distortion (k = 0.28). Agreement for asymmetries (one view) was slight (k = 0.09). Reader agreement was moderate for both calcification morphology (k = 0.51) and calcification distribution (k = 0.60). Readers with more experience (10 or more years in clinical practice) did not demonstrate higher levels of agreement than those with less experience.
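The verbal labels used above (slight, fair, moderate, substantial) are consistent with the widely used Landis and Koch benchmark scale for kappa. The abstract does not name the scale, so treating it as the one applied here is an assumption. The sketch below maps the reported kappa values onto those bands. The function name `landis_koch_label` and the dictionary keys are hypothetical.

```python
def landis_koch_label(kappa: float) -> str:
    """Return the Landis and Koch verbal label for a kappa value.

    Bands: <0 poor, 0-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate,
    0.61-0.80 substantial, 0.81-1.00 almost perfect.
    """
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("unreachable for kappa in [0, 1]")


# Kappa values as reported in the Results paragraph.
reported = {
    "calcifications": 0.82,
    "mass": 0.67,
    "calcification distribution": 0.60,
    "calcification morphology": 0.51,
    "architectural distortion": 0.28,
    "focal asymmetry": 0.21,
    "asymmetry (one view)": 0.09,
}
labels = {finding: landis_koch_label(k) for finding, k in reported.items()}
```

Under this scale, the calcification result (k = 0.82) would fall in the "almost perfect" band, while the remaining labels match those given in the Results.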

Conclusions: Strength of agreement varies widely for different types of mammographic findings, even among dedicated academic breast radiologists. More subtle findings such as asymmetries and architectural distortion demonstrated the weakest agreement. Studies that seek to evaluate the predictive value of certain mammographic features for malignancy should take into consideration the inherent interpretive variability for these findings.

Keywords: BI-RADS; Breast Imaging; Mammography.

Publication types

  • Multicenter Study

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Biopsy
  • Breast / pathology*
  • Breast Neoplasms / diagnostic imaging
  • Breast Neoplasms / pathology*
  • Calcinosis / diagnostic imaging
  • Calcinosis / pathology*
  • Carcinoma, Ductal, Breast / diagnostic imaging
  • Carcinoma, Ductal, Breast / pathology*
  • Clinical Competence / standards
  • Female
  • Health Facilities
  • Humans
  • Mammography / standards*
  • Middle Aged
  • Observer Variation
  • Radiologists / standards*
  • Retrospective Studies