Appropriateness and Comprehensiveness of Using ChatGPT for Perioperative Patient Education in Thoracic Surgery in Different Language Contexts: Survey Study

Interact J Med Res. 2023 Aug 14:12:e46900. doi: 10.2196/46900.

Abstract

Background: ChatGPT, a dialogue-based artificial intelligence language model, has shown promise in assisting clinical workflows and patient-clinician communication. However, there is a lack of feasibility assessments regarding its use for perioperative patient education in thoracic surgery.

Objective: This study aimed to assess the appropriateness and comprehensiveness of using ChatGPT for perioperative patient education in thoracic surgery in both English and Chinese contexts.

Methods: This pilot study was conducted in February 2023. A total of 37 questions focused on perioperative patient education in thoracic surgery were created based on guidelines and clinical experience. Two sets of inquiries were made to ChatGPT for each question, one in English and the other in Chinese. The responses generated by ChatGPT were evaluated separately by experienced thoracic surgical clinicians for appropriateness and comprehensiveness based on a hypothetical draft response to a patient's question on the electronic information platform. For a response to be qualified, it required at least 80% of reviewers to deem it appropriate and 50% to deem it comprehensive. Statistical analyses were performed using the unpaired chi-square test or Fisher exact test, with a significance level set at P<.05.

Results: The set of 37 commonly asked questions covered topics such as disease information, diagnostic procedures, perioperative complications, treatment measures, disease prevention, and perioperative care considerations. In both the English and Chinese contexts, 34 (92%) out of 37 responses were qualified in terms of both appropriateness and comprehensiveness. The remaining 3 (8%) responses were unqualified in these 2 contexts. The unqualified responses primarily involved the diagnosis of disease symptoms and surgical-related complications symptoms. The reasons for determining the responses as unqualified were similar in both contexts. There was no statistically significant difference (34/37, 92% vs 34/37, 92%; P=.99) in the qualification rate between the 2 language sets.

Conclusions: This pilot study demonstrates the potential feasibility of using ChatGPT for perioperative patient education in thoracic surgery in both English and Chinese contexts. ChatGPT is expected to enhance patient satisfaction, reduce anxiety, and improve compliance during the perioperative period. In the future, there will be remarkable potential application for using artificial intelligence, in conjunction with human review, for patient education and health consultation after patients have provided their informed consent.

Keywords: AI; ChatGPT; Generative Pre-trained Transformer; artificial intelligence; clinical workflow; communication; education; evaluation; feasibility; language; language model; patient; patient education; surgery; thoracic; thoracic surgery; workflow.