Heart-to-heart with ChatGPT: the impact of patients consulting AI for cardiovascular health advice

Anton Danholt Lautrup; Tobias Hyrup; Anna Schneider-Kamp; Marie Dahl; Jes Sanddal Lindholt; Peter Schneider-Kamp

doi:10.1136/openhrt-2023-002455

Open Heart. 2023 Nov;10(2):e002455. doi: 10.1136/openhrt-2023-002455.

Authors

Anton Danholt Lautrup¹, Tobias Hyrup¹, Anna Schneider-Kamp², Marie Dahl³, Jes Sanddal Lindholt⁴, Peter Schneider-Kamp⁵

Affiliations

¹ Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
² Department of Business and Management, University of Southern Denmark Faculty of Business and Social Sciences, Odense, Denmark.
³ Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
⁴ Department of Clinical Research, University of Southern Denmark, Odense, Denmark.
⁵ Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark. petersk@sdu.dk

Abstract

Objectives: The advent of conversational artificial intelligence (AI) systems employing large language models, such as ChatGPT, has sparked public, professional and academic debates on the capabilities of such technologies. This mixed-methods study reviews the literature and systematically explores whether ChatGPT can provide adequate health advice to patients when prompted on four topics from the field of cardiovascular diseases.

Methods: As of 30 May 2023, 528 items on PubMed contained the term ChatGPT in their title and/or abstract, with 258 being classified as journal articles and included in our thematic state-of-the-art review. For the experimental part, we systematically developed and assessed 123 prompts across the four topics based on three classes of users and two languages. Medical and communications experts scored ChatGPT's responses according to the 4Cs of language model evaluation proposed in this article: correct, concise, comprehensive and comprehensible.
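As an illustration of the scoring scheme described above, the following minimal sketch shows one way ratings on the 4Cs could be tabulated and averaged by prompt attribute. It is not the authors' analysis code. The 1-5 scale, the `ScoredResponse` type, the `mean_scores_by` helper, and all labels and values in the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# The four criteria named in the abstract.
CRITERIA = ("correct", "concise", "comprehensive", "comprehensible")


@dataclass(frozen=True)
class ScoredResponse:
    """One rated response; all field values below are hypothetical."""
    topic: str
    user_class: str
    language: str
    scores: dict  # criterion -> rating on an ASSUMED 1-5 scale


def mean_scores_by(responses, attribute):
    """Average each criterion over responses grouped by one attribute."""
    grouped = defaultdict(lambda: defaultdict(list))
    for response in responses:
        group = getattr(response, attribute)
        for criterion in CRITERIA:
            grouped[group][criterion].append(response.scores[criterion])
    return {
        group: {c: mean(values[c]) for c in CRITERIA}
        for group, values in grouped.items()
    }


# Made-up ratings, not data from the study.
sample = [
    ScoredResponse("topic_A", "high_literacy", "low_resource",
                   {"correct": 5, "concise": 4, "comprehensive": 4, "comprehensible": 5}),
    ScoredResponse("topic_A", "low_literacy", "low_resource",
                   {"correct": 4, "concise": 4, "comprehensive": 3, "comprehensible": 4}),
    ScoredResponse("topic_B", "low_literacy", "high_resource",
                   {"correct": 3, "concise": 2, "comprehensive": 3, "comprehensible": 4}),
]

by_language = mean_scores_by(sample, "language")
by_user_class = mean_scores_by(sample, "user_class")
```

Grouping by `language` or `user_class` in this way mirrors the kinds of comparison the abstract reports (health literacy level and language), under the stated assumptions.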

Results: The articles reviewed were fairly evenly distributed across three themes: the use of ChatGPT in medical publishing, in clinical practice, and in the education of medical personnel and/or patients. Quantitative and qualitative assessment of ChatGPT's responses to the 123 prompts demonstrated that, while the responses generally received above-average scores, they occupy a spectrum from the concise and correct via the absurd to what can only be described as hazardously incorrect and incomplete. Prompts formulated at higher levels of health literacy generally yielded higher-quality answers. Counterintuitively, responses in a lower-resource language were often of higher quality.

Conclusions: The results emphasise the relationship between prompt and response quality and hint at potentially concerning futures in personalised medicine. The widespread use of large language models for health advice might amplify existing health inequalities and will increase the pressure on healthcare systems by providing easy access to many seemingly likely differential diagnoses and recommendations for seeing a doctor for even harmless ailments.

Keywords: computer simulation; myocardial infarction; systematic reviews as topic.

Publication types

Review

MeSH terms

Artificial Intelligence*
Cardiovascular Diseases* / diagnosis
Cardiovascular Diseases* / therapy
Heart
Humans
Patients
Referral and Consultation