Metrics reloaded: recommendations for image analysis validation

Lena Maier-Hein; Annika Reinke; Patrick Godau; Minu D Tizabi; Florian Buettner; Evangelia Christodoulou; Ben Glocker; Fabian Isensee; Jens Kleesiek; Michal Kozubek; Mauricio Reyes; Michael A Riegler; Manuel Wiesenfarth; A Emre Kavur; Carole H Sudre; Michael Baumgartner; Matthias Eisenmann; Doreen Heckmann-Nötzel; Tim Rädsch; Laura Acion; Michela Antonelli; Tal Arbel; Spyridon Bakas; Arriel Benis; Matthew B Blaschko; M Jorge Cardoso; Veronika Cheplygina; Beth A Cimini; Gary S Collins; Keyvan Farahani; Luciana Ferrer; Adrian Galdran; Bram van Ginneken; Robert Haase; Daniel A Hashimoto; Michael M Hoffman; Merel Huisman; Pierre Jannin; Charles E Kahn; Dagmar Kainmueller; Bernhard Kainz; Alexandros Karargyris; Alan Karthikesalingam; Florian Kofler; Annette Kopp-Schneider; Anna Kreshuk; Tahsin Kurc; Bennett A Landman; Geert Litjens; Amin Madani; Klaus Maier-Hein; Anne L Martel; Peter Mattson; Erik Meijering; Bjoern Menze; Karel G M Moons; Henning Müller; Brennan Nichyporuk; Felix Nickel; Jens Petersen; Nasir Rajpoot; Nicola Rieke; Julio Saez-Rodriguez; Clara I Sánchez; Shravya Shetty; Maarten van Smeden; Ronald M Summers; Abdel A Taha; Aleksei Tiulpin; Sotirios A Tsaftaris; Ben Van Calster; Gaël Varoquaux; Paul F Jäger

doi:10.1038/s41592-023-02151-z

Nat Methods. 2024 Feb;21(2):195-212. doi: 10.1038/s41592-023-02151-z. Epub 2024 Feb 12.

Authors

Lena Maier-Hein^#^{1, 2, 3, 4, 5}, Annika Reinke^#^{6, 7, 8}, Patrick Godau^{9, 10, 11}, Minu D Tizabi^{9, 11}, Florian Buettner^{12, 13, 14, 15, 16}, Evangelia Christodoulou⁹, Ben Glocker¹⁷, Fabian Isensee^{18, 19}, Jens Kleesiek²⁰, Michal Kozubek²¹, Mauricio Reyes^{22, 23}, Michael A Riegler^{24, 25}, Manuel Wiesenfarth²⁶, A Emre Kavur^{9, 18, 19}, Carole H Sudre^{27, 28}, Michael Baumgartner¹⁸, Matthias Eisenmann⁹, Doreen Heckmann-Nötzel^{9, 11}, Tim Rädsch^{9, 29}, Laura Acion³⁰, Michela Antonelli^{28, 31}, Tal Arbel³², Spyridon Bakas^{33, 34}, Arriel Benis^{35, 36}, Matthew B Blaschko³⁷, M Jorge Cardoso²⁸, Veronika Cheplygina³⁸, Beth A Cimini³⁹, Gary S Collins⁴⁰, Keyvan Farahani⁴¹, Luciana Ferrer⁴², Adrian Galdran^{43, 44}, Bram van Ginneken^{45, 46}, Robert Haase^{47, 48, 49}, Daniel A Hashimoto^{50, 51}, Michael M Hoffman^{52, 53, 54, 55}, Merel Huisman⁵⁶, Pierre Jannin^{57, 58}, Charles E Kahn⁵⁹, Dagmar Kainmueller^{60, 61}, Bernhard Kainz^{62, 63}, Alexandros Karargyris⁶⁴, Alan Karthikesalingam⁶⁵, Florian Kofler⁶⁶, Annette Kopp-Schneider²⁶, Anna Kreshuk⁶⁷, Tahsin Kurc⁶⁸, Bennett A Landman⁶⁹, Geert Litjens⁷⁰, Amin Madani⁷¹, Klaus Maier-Hein^{18, 72}, Anne L Martel^{53, 55, 73}, Peter Mattson⁷⁴, Erik Meijering⁷⁵, Bjoern Menze⁷⁶, Karel G M Moons⁷⁷, Henning Müller^{78, 79}, Brennan Nichyporuk⁸⁰, Felix Nickel⁸¹, Jens Petersen¹⁸, Nasir Rajpoot⁸², Nicola Rieke⁸³, Julio Saez-Rodriguez^{84, 85}, Clara I Sánchez⁸⁶, Shravya Shetty⁸⁷, Maarten van Smeden⁷⁷, Ronald M Summers⁸⁸, Abdel A Taha⁸⁹, Aleksei Tiulpin^{90, 91}, Sotirios A Tsaftaris⁹², Ben Van Calster^{93, 94}, Gaël Varoquaux⁹⁵, Paul F Jäger^{96, 97}

Affiliations

¹ German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany. l.maier-hein@dkfz-heidelberg.de.
² German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany. l.maier-hein@dkfz-heidelberg.de.
³ Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany. l.maier-hein@dkfz-heidelberg.de.
⁴ Medical Faculty, Heidelberg University, Heidelberg, Germany. l.maier-hein@dkfz-heidelberg.de.
⁵ National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany. l.maier-hein@dkfz-heidelberg.de.
⁶ German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany. a.reinke@dkfz-heidelberg.de.
⁷ German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany. a.reinke@dkfz-heidelberg.de.
⁸ Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany. a.reinke@dkfz-heidelberg.de.
⁹ German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany.
¹⁰ Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany.
¹¹ National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany.
¹² German Cancer Consortium (DKTK), partner site Frankfurt/Mainz, a partnership between DKFZ and UCT Frankfurt-Marburg, Frankfurt am Main, Germany.
¹³ German Cancer Research Center (DKFZ) Heidelberg, Heidelberg, Germany.
¹⁴ Department of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany.
¹⁵ Department of Informatics, Goethe University Frankfurt, Frankfurt am Main, Germany.
¹⁶ Frankfurt Cancer Institute, Frankfurt am Main, Germany.
¹⁷ Department of Computing, Imperial College London, South Kensington Campus, London, UK.
¹⁸ German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany.
¹⁹ German Cancer Research Center (DKFZ) Heidelberg, HI Applied Computer Vision Lab, Heidelberg, Germany.
²⁰ Institute for AI in Medicine, University Medicine Essen, Essen, Germany.
²¹ Centre for Biomedical Image Analysis and Faculty of Informatics, Masaryk University, Brno, Czech Republic.
²² ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland.
²³ Department of Radiation Oncology, University Hospital Bern, University of Bern, Bern, Switzerland.
²⁴ Simula Metropolitan Center for Digital Engineering, Oslo, Norway.
²⁵ Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway.
²⁶ German Cancer Research Center (DKFZ) Heidelberg, Division of Biostatistics, Heidelberg, Germany.
²⁷ MRC Unit for Lifelong Health and Ageing at UCL and Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK.
²⁸ School of Biomedical Engineering and Imaging Science, King's College London, London, UK.
²⁹ German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany.
³⁰ Instituto de Cálculo, CONICET - Universidad de Buenos Aires, Buenos Aires, Argentina.
³¹ Centre for Medical Image Computing, University College London, London, UK.
³² Centre for Intelligent Machines and MILA (Québec Artificial Intelligence Institute), McGill University, Montréal, Quebec, Canada.
³³ Division of Computational Pathology, Department of Pathology & Laboratory Medicine, Indiana University School of Medicine, IU Health Information and Translational Sciences Building, Indianapolis, IN, USA.
³⁴ Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA.
³⁵ Department of Digital Medical Technologies, Holon Institute of Technology, Holon, Israel.
³⁶ European Federation for Medical Informatics, Le Mont-sur-Lausanne, Switzerland.
³⁷ Center for Processing Speech and Images, Department of Electrical Engineering, KU Leuven, Leuven, Belgium.
³⁸ Department of Computer Science, IT University of Copenhagen, Copenhagen, Denmark.
³⁹ Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
⁴⁰ Centre for Statistics in Medicine, University of Oxford, Nuffield Orthopaedic Centre, Oxford, UK.
⁴¹ Center for Biomedical Informatics and Information Technology, National Cancer Institute, Bethesda, MD, USA.
⁴² Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-UBA, Ciudad Autónoma de Buenos Aires, Buenos Aires, Argentina.
⁴³ BCN Medtech, Universitat Pompeu Fabra, Barcelona, Spain.
⁴⁴ Australian Institute for Machine Learning AIML, University of Adelaide, Adelaide, South Australia, Australia.
⁴⁵ Fraunhofer MEVIS, Bremen, Germany.
⁴⁶ Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, the Netherlands.
⁴⁷ Technische Universität (TU) Dresden, DFG Cluster of Excellence 'Physics of Life', Dresden, Germany.
⁴⁸ Center for Systems Biology, Dresden, Germany.
⁴⁹ Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Leipzig University, Leipzig, Germany.
⁵⁰ Department of Surgery, Perelman School of Medicine, Philadelphia, PA, USA.
⁵¹ General Robotics Automation Sensing and Perception Laboratory, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA.
⁵² Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada.
⁵³ Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
⁵⁴ Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada.
⁵⁵ Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
⁵⁶ Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, the Netherlands.
⁵⁷ Laboratoire Traitement du Signal et de l'Image - UMR_S 1099, Université de Rennes 1, Rennes, France.
⁵⁸ INSERM, Paris, France.
⁵⁹ Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA.
⁶⁰ Max-Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Biomedical Image Analysis and HI Helmholtz Imaging, Berlin, Germany.
⁶¹ Digital Engineering Faculty, University of Potsdam, Potsdam, Germany.
⁶² Department of Computing, Faculty of Engineering, Imperial College London, London, UK.
⁶³ Department AIBE, Friedrich-Alexander-Universität (FAU), Erlangen-Nürnberg, Germany.
⁶⁴ IHU Strasbourg, Strasbourg, France.
⁶⁵ Google Health DeepMind, London, UK.
⁶⁶ Helmholtz AI, Oberschleißheim, Germany.
⁶⁷ Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany.
⁶⁸ Department of Biomedical Informatics, Stony Brook University, Health Science Center, Stony Brook, NY, USA.
⁶⁹ Electrical Engineering, Vanderbilt University, Nashville, TN, USA.
⁷⁰ Department of Pathology, Radboud University Medical Center, Nijmegen, the Netherlands.
⁷¹ Department of Surgery, University Health Network, Philadelphia, PA, USA.
⁷² Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany.
⁷³ Physical Sciences, Sunnybrook Research Institute, Toronto, Ontario, Canada.
⁷⁴ Google, 1600 Amphitheatre Pkwy, Mountain View, CA, USA.
⁷⁵ School of Computer Science and Engineering, University of New South Wales, UNSW Sydney, Kensington, New South Wales, Australia.
⁷⁶ Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland.
⁷⁷ Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, the Netherlands.
⁷⁸ Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO), Sierre, Switzerland.
⁷⁹ Medical Faculty, University of Geneva, Geneva, Switzerland.
⁸⁰ MILA (Québec Artificial Intelligence Institute), Montréal, Quebec, Canada.
⁸¹ Department of General, Visceral and Thoracic Surgery, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
⁸² Tissue Image Analytics Laboratory, Department of Computer Science, University of Warwick, Coventry, UK.
⁸³ NVIDIA, München, Germany.
⁸⁴ Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany.
⁸⁵ Faculty of Medicine, Heidelberg University Hospital, Heidelberg, Germany.
⁸⁶ Informatics Institute, Faculty of Science, University of Amsterdam, Amsterdam, the Netherlands.
⁸⁷ Google Health, Google, Palo Alto, CA, USA.
⁸⁸ National Institutes of Health Clinical Center, Bethesda, MD, USA.
⁸⁹ Institute of Information Systems Engineering, TU Wien, Vienna, Austria.
⁹⁰ Research Unit of Health Sciences and Technology, Faculty of Medicine, University of Oulu, Oulu, Finland.
⁹¹ Neurocenter Oulu, Oulu University Hospital, Oulu, Finland.
⁹² School of Engineering, The University of Edinburgh, Edinburgh, Scotland.
⁹³ Department of Development and Regeneration and EPI-centre, KU Leuven, Leuven, Belgium.
⁹⁴ Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands.
⁹⁵ Parietal project team, INRIA Saclay-Île de France, Palaiseau, France.
⁹⁶ German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany. p.jaeger@dkfz-heidelberg.de.
⁹⁷ German Cancer Research Center (DKFZ) Heidelberg, Interactive Machine Learning Group, Heidelberg, Germany. p.jaeger@dkfz-heidelberg.de.

^# Contributed equally.

PMID: 38347141
DOI: 10.1038/s41592-023-02151-z

Abstract

Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.

Publication types

Review

MeSH terms

Algorithms*
Image Processing, Computer-Assisted*
Machine Learning
Semantics

Grants and funding

P41 GM135019/GM/NIGMS NIH HHS/United States