Radiologists' interpretive efficiency and variability in true- and false-positive detection when screen-reading with tomosynthesis (3D-mammography) relative to standard mammography in population screening

Tony M Svahn; Petra Macaskill; Nehmat Houssami

doi:10.1016/j.breast.2015.08.012

Breast. 2015 Dec;24(6):687-93. doi: 10.1016/j.breast.2015.08.012. Epub 2015 Oct 1.

Authors

Tony M Svahn¹, Petra Macaskill², Nehmat Houssami²

Affiliations

¹ School of Public Health, Sydney Medical School, University of Sydney, Sydney 2006, NSW, Australia. Electronic address: TonyMartinSvahn@gmail.com.
² School of Public Health, Sydney Medical School, University of Sydney, Sydney 2006, NSW, Australia.

PMID: 26433751

Abstract

We examined interpretive efficiency and variability in true- and false-positive detection (TP, FP) for radiologists screen-reading with digital breast tomosynthesis as an adjunct to full-field digital mammography (2D/3D) relative to 2D alone in population-based screening studies. A systematic literature search was performed to identify screening studies that provided radiologist-specific data for TP and FP detection. Radiologist interpretive efficiency (the trade-off between TPs and FPs) was calculated using the FP:TP ratio, which expresses the number of FP recalls for each screen-detected breast cancer. We modeled a pooled FP:TP ratio at study level using random-effects logistic regression to assess variability in radiologists' interpretive efficiency. The FP:TP ratio improved (that is, decreased) for 2D/3D screen-reading relative to 2D for a majority of radiologists (18 of 22) across all studies. Variability in radiologists' FP:TP ratio was consistently lower in all studies for 2D/3D screen-reading, as suggested by lower variance in ratios. Study-level pooled FP:TP ratios for 2D and 2D/3D mammography, respectively, were 5.96 (95% CI: 4.08 to 8.72) and 3.17 (95% CI: 2.25 to 4.47) for the STORM trial; 10.25 (95% CI: 6.42 to 16.35) and 7.07 (95% CI: 4.99 to 10.02) for the Oslo trial; and 20.84 (95% CI: 13.95 to 31.12) and 8.37 (95% CI: 5.87 to 11.93) for the Houston study. This translates into study-level improvements in interpretive efficiency of 48%, 30% and 55%, respectively, for 2D/3D screen-reading relative to 2D. In summary, the study-level FP:TP trade-off improved with 2D/3D mammography in all studies, and the same improvement was seen for most individual radiologists. The FP:TP trade-off varied between readers and between studies for both 2D and 2D/3D interpretations, but variability in radiologists' interpretive efficiency was lower with 2D/3D mammography.
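
As an illustration of the FP:TP ratio defined in the abstract, the following is a minimal Python sketch, not the authors' analysis. The function names `fp_tp_ratio` and `relative_reduction` are hypothetical, and the relative reduction is computed as a simple ratio of the rounded pooled point estimates quoted above. That shortcut is an assumption: the abstract does not state how the reported 48%, 30% and 55% were derived, and this calculation need not reproduce them exactly.

```python
def fp_tp_ratio(false_positives: int, true_positives: int) -> float:
    """Number of false-positive recalls per screen-detected cancer."""
    if true_positives <= 0:
        raise ValueError("FP:TP ratio is undefined without at least one true positive")
    return false_positives / true_positives


def relative_reduction(ratio_2d: float, ratio_2d3d: float) -> float:
    """Fractional decrease in the FP:TP ratio going from 2D to 2D/3D reading.

    Assumption: a plain ratio of point estimates, not the random-effects
    model described in the abstract.
    """
    if ratio_2d <= 0:
        raise ValueError("2D ratio must be positive")
    return 1.0 - ratio_2d3d / ratio_2d


# Pooled study-level FP:TP ratios (2D, 2D/3D) as quoted in the abstract.
POOLED_RATIOS = {
    "STORM": (5.96, 3.17),
    "Oslo": (10.25, 7.07),
    "Houston": (20.84, 8.37),
}

for study, (ratio_2d, ratio_2d3d) in POOLED_RATIOS.items():
    reduction = relative_reduction(ratio_2d, ratio_2d3d)
    print(f"{study}: 2D {ratio_2d:.2f}, 2D/3D {ratio_2d3d:.2f}, "
          f"naive relative reduction {reduction:.0%}")
```

For a single reader, `fp_tp_ratio(50, 10)` with hypothetical counts of 50 false-positive recalls and 10 screen-detected cancers gives 5 false-positive recalls per cancer detected; a lower value indicates a more favourable trade-off.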

Keywords: Digital breast tomosynthesis; Interpretive efficiency; Mammography; Population screening; Reader variability.

Publication types

Meta-Analysis
Review

MeSH terms

Adult
Aged
Breast Neoplasms / diagnostic imaging*
Clinical Competence / statistics & numerical data
Early Detection of Cancer / methods
False Positive Reactions
Female
Humans
Imaging, Three-Dimensional / methods*
Mammography / methods*
Mass Screening / methods
Middle Aged
Radiology / methods
Radiology / statistics & numerical data*
Sensitivity and Specificity