Adjusting for verification bias in diagnostic accuracy measures when comparing multiple screening tests - an application to the IP1-PROSTAGRAM study

Emily Day; David Eldred-Evans; A Toby Prevost; Hashim U Ahmed; Francesca Fiorentino

doi:10.1186/s12874-021-01481-w

BMC Med Res Methodol. 2022 Mar 18;22(1):70. doi: 10.1186/s12874-021-01481-w.

Authors

Emily Day¹, David Eldred-Evans², A Toby Prevost³, Hashim U Ahmed²,⁴, Francesca Fiorentino⁵,⁶,⁷

Affiliations

¹ Imperial Clinical Trials Unit, Imperial College London, London, UK.
² Imperial Prostate, Division of Surgery, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK.
³ Nightingale-Saunders Unit, King's Clinical Trials Unit, King's College London, London, UK.
⁴ Imperial Urology, Imperial College Healthcare NHS Trust, London, UK.
⁵ Imperial Clinical Trials Unit, Imperial College London, London, UK. francesca.fiorentino@kcl.ac.uk.
⁶ Nightingale-Saunders Unit, King's Clinical Trials Unit, King's College London, London, UK. francesca.fiorentino@kcl.ac.uk.
⁷ Division of Surgery, Imperial College London, St Mary's Hospital, Praed Street, London, W2 1NY, UK. francesca.fiorentino@kcl.ac.uk.

Abstract

Introduction: Novel screening tests used to detect a target condition are compared against either a reference standard or other existing screening methods. However, it is not always possible to apply the reference standard to the whole population under study, and this introduces verification bias. Statistical methods exist to adjust estimates for this bias. We extend common methods of adjusting for verification bias to the setting where multiple tests are compared with a reference standard, using data from a prospective double-blind screening study for prostate cancer.

Methods: The Begg and Greenes method and multiple imputation are extended to include the results of multiple screening tests that determine condition verification status. These two methods are compared with the complete case analysis using the IP1-PROSTAGRAM study data. IP1-PROSTAGRAM used a paired-cohort double-blind design to evaluate imaging tests as alternatives to a blood test, prostate-specific antigen (PSA), for prostate cancer screening. Participants with positive imaging (index) and/or PSA (control) underwent a prostate biopsy (reference standard).
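As an illustration of the basic idea behind the Begg and Greenes correction, the following minimal sketch handles a single binary screening test, not the multiple-test extension developed in the paper. It assumes that verification depends only on the test result, so the disease rate seen in verified participants within each test-result group can be applied to everyone in that group. The function names and all counts are hypothetical and are not IP1-PROSTAGRAM data.

```python
def complete_case(tp, fp, fn, tn):
    """Sensitivity and specificity using verified participants only."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def begg_greenes(tp, fp, fn, tn, n_test_pos, n_test_neg):
    """Single-test Begg and Greenes correction (illustrative sketch).

    tp, fp, fn, tn: counts among verified participants.
    n_test_pos, n_test_neg: all participants with a positive or negative
    test result, whether or not they were verified.
    """
    # Spread each test-result group over disease status using the
    # disease rate observed in the verified members of that group.
    exp_tp = n_test_pos * tp / (tp + fp)
    exp_fp = n_test_pos * fp / (tp + fp)
    exp_fn = n_test_neg * fn / (fn + tn)
    exp_tn = n_test_neg * tn / (fn + tn)
    sensitivity = exp_tp / (exp_tp + exp_fn)
    specificity = exp_tn / (exp_tn + exp_fp)
    return sensitivity, specificity


# Hypothetical counts: all 100 test positives are verified,
# but only 50 of 500 test negatives are verified.
cc_sens, cc_spec = complete_case(tp=80, fp=20, fn=10, tn=40)
bg_sens, bg_spec = begg_greenes(tp=80, fp=20, fn=10, tn=40,
                                n_test_pos=100, n_test_neg=500)
print(f"complete case:    sens={cc_sens:.3f}, spec={cc_spec:.3f}")
print(f"Begg and Greenes: sens={bg_sens:.3f}, spec={bg_spec:.3f}")
```

In this made-up example, test negatives are under-represented among verified participants, so the complete case analysis understates specificity and overstates sensitivity compared with the corrected estimates.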

Results: Compared with the complete case analysis, the Begg and Greenes and multiple imputation methods gave a statistically significant increase in the specificity estimates for all screening tests. Sensitivity estimates remained similar across the methods, with completely overlapping 95% confidence intervals. Negative predictive value (NPV) estimates were higher after adjusting for verification bias than in the complete case analysis, although the 95% confidence intervals overlapped. Positive predictive value (PPV) estimates were similar across all methods.

Conclusion: Statistical methods are required to adjust for verification bias in accuracy estimates of screening tests. Extending the Begg and Greenes method to include multiple screening tests can be computationally intensive; multiple imputation is therefore recommended, especially as it can be modified for low prevalence of the target condition.
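To make the multiple imputation approach more concrete, here is a rough sketch for a single binary test. It is not the authors' imputation model: it assumes verification depends only on the test result, draws the probability of disease in each test-result group from a Beta posterior with a uniform prior (an assumption made here for simplicity), imputes disease status for unverified participants, and averages the resulting estimates. Variance pooling with Rubin's rules is omitted, and all names and counts are hypothetical.

```python
import numpy as np


def mi_sens_spec(tp, fp, fn, tn, n_unverified_pos, n_unverified_neg,
                 n_imputations=50, seed=0):
    """Multiply impute disease status for unverified participants.

    tp, fp, fn, tn: counts among verified participants.
    n_unverified_pos, n_unverified_neg: unverified participants with a
    positive or negative test result.
    Returns the mean sensitivity and specificity across imputations.
    """
    rng = np.random.default_rng(seed)
    sens_draws, spec_draws = [], []
    for _ in range(n_imputations):
        # Draw P(disease | test result) from a Beta posterior so that
        # uncertainty in these probabilities carries into the imputations.
        p_dis_pos = rng.beta(tp + 1, fp + 1)
        p_dis_neg = rng.beta(fn + 1, tn + 1)
        # Impute the number of diseased among the unverified in each group.
        imp_pos = rng.binomial(n_unverified_pos, p_dis_pos)
        imp_neg = rng.binomial(n_unverified_neg, p_dis_neg)
        # Completed 2x2 table for this imputation.
        full_tp = tp + imp_pos
        full_fp = fp + (n_unverified_pos - imp_pos)
        full_fn = fn + imp_neg
        full_tn = tn + (n_unverified_neg - imp_neg)
        sens_draws.append(full_tp / (full_tp + full_fn))
        spec_draws.append(full_tn / (full_tn + full_fp))
    return float(np.mean(sens_draws)), float(np.mean(spec_draws))


# Hypothetical counts: 450 test negatives were never verified.
mi_sens, mi_spec = mi_sens_spec(tp=80, fp=20, fn=10, tn=40,
                                n_unverified_pos=0, n_unverified_neg=450)
print(f"multiple imputation: sens={mi_sens:.3f}, spec={mi_spec:.3f}")
```

In practice the imputation model would usually be a regression on all screening test results and relevant covariates, which is what lets multiple imputation scale to several tests more easily than a stratified correction.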

Keywords: Begg and Greenes; Multiple imputation; Sensitivity; Specificity; Verification bias.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bias
Double-Blind Method
Humans
Male
Mass Screening*
Prospective Studies
Prostate-Specific Antigen*
Sensitivity and Specificity

Substances

Prostate-Specific Antigen

Grants and funding

204998/Z/16/Z/WT_/Wellcome Trust/United Kingdom