Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons

Nancy A Obuchowski; Anthony P Reeves; Erich P Huang; Xiao-Feng Wang; Andrew J Buckler; Hyun J Grace Kim; Huiman X Barnhart; Edward F Jackson; Maryellen L Giger; Gene Pennello; Alicia Y Toledano; Jayashree Kalpathy-Cramer; Tatiyana V Apanasovich; Paul E Kinahan; Kyle J Myers; Dmitry B Goldgof; Daniel P Barboriak; Robert J Gillies; Lawrence H Schwartz; Daniel C Sullivan; Algorithm Comparison Working Group

doi:10.1177/0962280214537390

Stat Methods Med Res. 2015 Feb;24(1):68-106. doi: 10.1177/0962280214537390. Epub 2014 Jun 11.

Authors

Nancy A Obuchowski¹, Anthony P Reeves², Erich P Huang³, Xiao-Feng Wang⁴, Andrew J Buckler⁵, Hyun J Grace Kim⁶, Huiman X Barnhart⁷, Edward F Jackson⁸, Maryellen L Giger⁹, Gene Pennello¹⁰, Alicia Y Toledano¹¹, Jayashree Kalpathy-Cramer¹², Tatiyana V Apanasovich¹³, Paul E Kinahan¹⁴, Kyle J Myers¹⁰, Dmitry B Goldgof¹⁵, Daniel P Barboriak⁷, Robert J Gillies¹⁶, Lawrence H Schwartz¹⁷, Daniel C Sullivan⁷; Algorithm Comparison Working Group

Affiliations

¹ Cleveland Clinic Foundation, Cleveland, OH, USA; obuchon@ccf.org.
² Cornell University, Ithaca, NY, USA.
³ National Institutes of Health, Rockville, MD, USA.
⁴ Cleveland Clinic Foundation, Cleveland, OH, USA.
⁵ Elucid Bioimaging Inc., Wenham, MA, USA.
⁶ University of California, Los Angeles, CA, USA.
⁷ Duke University, Durham, NC, USA.
⁸ University of Wisconsin-Madison, Madison, WI, USA.
⁹ University of Chicago, Chicago, IL, USA.
¹⁰ Food and Drug Administration/CDRH, Silver Spring, MD, USA.
¹¹ Biostatistics Consulting, LLC, Kensington, MD, USA.
¹² MGH/Harvard Medical School, Boston, MA, USA.
¹³ George Washington University, NW Washington, DC, USA.
¹⁴ University of Washington, Seattle, WA, USA.
¹⁵ University of South Florida, Tampa, FL, USA.
¹⁶ H. Moffitt Cancer Center, Tampa, FL, USA.
¹⁷ Columbia University, New York, NY, USA.

Abstract

Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.

Keywords: agreement; bias; image metrics; imaging biomarkers; precision; quantitative imaging; repeatability; reproducibility.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Review

MeSH terms

Algorithms*
Bias
Biomarkers*
Computer Simulation
Diagnostic Imaging*
Humans
Phantoms, Imaging
Reference Standards
Reproducibility of Results
Research Design*
Statistics as Topic*

Substances

Biomarkers
