Human-Written vs AI-Generated Texts in Orthopedic Academic Literature: Comparative Qualitative Analysis

JMIR Form Res. 2024 Feb 16:8:e52164. doi: 10.2196/52164.

Abstract

Background: As large language models (LLMs) become increasingly integrated into different aspects of health care, questions about the implications for medical academic literature have begun to emerge. Key aspects such as authenticity in academic writing are at stake now that artificial intelligence (AI) can generate linguistically accurate and grammatically sound texts.

Objective: The objective of this study is to compare human-written with AI-generated scientific literature in orthopedics and sports medicine.

Methods: Five original abstracts were selected from the PubMed database. These abstracts were then rewritten with the assistance of 2 LLMs with different degrees of proficiency. Researchers with varying degrees of expertise and different areas of specialization were asked to rank the abstracts according to linguistic and methodological parameters. Finally, the researchers classified each text as AI generated or human written.

Results: Neither the researchers nor the AI-detection software could successfully identify the AI-generated texts. Furthermore, the criteria previously suggested in the literature correlated neither with whether the researchers deemed a text to be AI generated nor with whether they classified it correctly based on these parameters.

Conclusions: The primary finding of this study was that researchers were unable to distinguish between LLM-generated and human-written texts. However, due to the small sample size, it is not possible to generalize the results of this study. As is the case with any tool used in academic research, the potential to cause harm can be mitigated by relying on the transparency and integrity of the researchers. With scientific integrity at stake, further research with a similar study design should be conducted to determine the magnitude of this issue.

Keywords: AI; LLM; artificial intelligence; detection; feedback; large language model; medical database; orthopedic; orthopedic surgery; orthopedics; qualitative study; research; scientific integrity; sports medicine; study design; surgery; tool.