Collaboration between explainable artificial intelligence and pulmonologists improves the accuracy of pulmonary function test interpretation

Nilakash Das; Sofie Happaerts; Iwein Gyselinck; Michael Staes; Eric Derom; Guy Brusselle; Felip Burgos; Marco Contoli; Anh Tuan Dinh-Xuan; Frits M E Franssen; Sherif Gonem; Neil Greening; Christel Haenebalcke; William D-C Man; Jorge Moisés; Rudi Peché; Vitalii Poberezhets; Jennifer K Quint; Michael C Steiner; Eef Vanderhelst; Mustafa Abdo; Marko Topalovic; Wim Janssens

doi:10.1183/13993003.01720-2022

Eur Respir J. 2023 May 18;61(5):2201720. doi: 10.1183/13993003.01720-2022. Print 2023 May.

Authors

Nilakash Das¹, Sofie Happaerts², Iwein Gyselinck¹,², Michael Staes¹,², Eric Derom³, Guy Brusselle³, Felip Burgos⁴, Marco Contoli⁵, Anh Tuan Dinh-Xuan⁶, Frits M E Franssen⁷, Sherif Gonem⁸, Neil Greening⁹, Christel Haenebalcke¹⁰, William D-C Man¹¹,¹², Jorge Moisés¹³, Rudi Peché¹⁴, Vitalii Poberezhets¹⁵, Jennifer K Quint¹¹,¹², Michael C Steiner⁹, Eef Vanderhelst¹⁶, Mustafa Abdo¹⁷, Marko Topalovic¹⁸, Wim Janssens¹⁹,²

Affiliations

¹ Laboratory of Respiratory Diseases and Thoracic Surgery, Department of Chronic Diseases Metabolism and Ageing, KU Leuven, Leuven, Belgium.
² Clinical Department of Respiratory Diseases, University Hospitals Leuven, Leuven, Belgium.
³ UZ Gent, University of Ghent, Ghent, Belgium.
⁴ Department of Pulmonary Medicine, Hospital Clinic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, Barcelona, Spain.
⁵ Department of Translational Medicine, University of Ferrara, Ferrara, Italy.
⁶ Service de Physiologie-Explorations Fonctionnelles, AP-HP, Hôpital Cochin, Université Paris Cité, Paris, France.
⁷ Department of Respiratory Medicine and School of Nutrition and Translational Research in Metabolism (NUTRIM), Maastricht University Medical Center, Maastricht, The Netherlands.
⁸ Nottingham University Hospitals NHS Trust, Nottingham, UK.
⁹ Leicester NIHR Biomedical Research Centre - Respiratory, Department of Respiratory Sciences, University of Leicester, Leicester, UK.
¹⁰ AZ Sint-Jan Brugge-Oostende, Bruges, Belgium.
¹¹ National Heart and Lung Institute, Imperial College London, London, UK.
¹² Royal Brompton and Harefield Clinical Group, Guy's and St Thomas' NHS Foundation Trust, London, UK.
¹³ Biomedical Research Networking Center on Respiratory Diseases (CIBERES), Madrid, Spain.
¹⁴ CHU Charleroi, Charleroi, Belgium.
¹⁵ Department of Propedeutics of Internal Medicine, National Pirogov Memorial Medical University, Vinnytsya, Ukraine.
¹⁶ University Hospital of Brussels, Vrije Universiteit Brussel, Brussels, Belgium.
¹⁷ LungenClinic Grosshansdorf, Grosshansdorf, Germany.
¹⁸ ArtiQ NV, Leuven, Belgium.
¹⁹ Laboratory of Respiratory Diseases and Thoracic Surgery, Department of Chronic Diseases Metabolism and Ageing, KU Leuven, Leuven, Belgium. Email: wim.janssens@uzleuven.be

Abstract

Background: Few studies have investigated the collaborative potential between artificial intelligence (AI) and pulmonologists for diagnosing pulmonary disease. We hypothesised that the collaboration between a pulmonologist and AI with explanations (explainable AI (XAI)) is superior to the pulmonologist without support in the diagnostic interpretation of pulmonary function tests (PFTs).

Methods: The study was conducted in two phases, a monocentre study (phase 1) and a multicentre intervention study (phase 2). Each phase utilised two different sets of 24 PFT reports of patients with a clinically validated gold standard diagnosis. Each PFT was interpreted without (control) and with XAI's suggestions (intervention). Pulmonologists provided a differential diagnosis consisting of a preferential diagnosis and optionally up to three additional diagnoses. The primary end-point compared accuracy of preferential and additional diagnoses between control and intervention. Secondary end-points were the number of diagnoses in differential diagnosis, diagnostic confidence and inter-rater agreement. We also analysed how XAI influenced pulmonologists' decisions.
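As an illustration of the primary end-point, the following minimal Python sketch shows one way the two accuracies could be computed. It assumes that preferential accuracy counts a case as correct when the first-listed diagnosis matches the gold standard, and that differential accuracy counts a case as correct when the gold standard appears anywhere in the differential (the preferential diagnosis plus up to three additional ones). These definitions, the function names and the toy data are assumptions made for illustration only; they are not taken from the study.

```python
from typing import Sequence


def preferential_accuracy(gold: Sequence[str],
                          differentials: Sequence[Sequence[str]]) -> float:
    """Share of cases whose first-listed (preferential) diagnosis equals the gold standard."""
    hits = sum(1 for g, d in zip(gold, differentials) if d and d[0] == g)
    return hits / len(gold)


def differential_accuracy(gold: Sequence[str],
                          differentials: Sequence[Sequence[str]]) -> float:
    """Share of cases whose gold standard appears anywhere in the differential."""
    hits = sum(1 for g, d in zip(gold, differentials) if g in d)
    return hits / len(gold)


# Hypothetical toy data: four PFT reports read by one pulmonologist,
# once without support (control) and once with XAI suggestions (intervention).
gold = ["COPD", "asthma", "ILD", "healthy"]
control = [["asthma", "COPD"], ["asthma"], ["COPD"], ["healthy"]]
intervention = [["COPD"], ["asthma"], ["ILD", "COPD"], ["healthy"]]

for label, reads in (("control", control), ("intervention", intervention)):
    print(label,
          preferential_accuracy(gold, reads),
          differential_accuracy(gold, reads))
```

In a design like the one described, these per-reader values would be averaged over pulmonologists and compared between the control and intervention readings.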

Results: In phase 1 (n=16 pulmonologists), mean preferential and differential diagnostic accuracy significantly increased by 10.4% and 9.4%, respectively, between control and intervention (p<0.001). Improvements were somewhat lower but highly significant (p<0.0001) in phase 2 (5.4% and 8.7%, respectively; n=62 pulmonologists). In both phases, the number of diagnoses in the differential diagnosis did not decrease, but diagnostic confidence and inter-rater agreement significantly increased during intervention. Pulmonologists updated their decisions in response to XAI's feedback and consistently improved on their baseline performance when the AI provided correct predictions.

Conclusion: A collaboration between a pulmonologist and XAI is better at interpreting PFTs than individual pulmonologists reading without XAI support or XAI alone.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Artificial Intelligence*
Humans
Lung Diseases* / diagnosis
Pulmonologists
Respiratory Function Tests