Detecting the corruption of online questionnaires by artificial intelligence

Benjamin Lebrun; Sharon Temtsin; Andrew Vonasch; Christoph Bartneck

doi:10.3389/frobt.2023.1277635

Front Robot AI. 2024 Feb 2;10:1277635. doi: 10.3389/frobt.2023.1277635. eCollection 2023.

Authors

Benjamin Lebrun¹, Sharon Temtsin², Andrew Vonasch¹, Christoph Bartneck²

Affiliations

¹ School of Psychology, Speech, and Hearing, University of Canterbury, Christchurch, New Zealand.
² Department of Computer Science and Software Engineering, University of Canterbury, Christchurch, New Zealand.

Abstract

Online questionnaires that use crowdsourcing platforms to recruit participants have become commonplace, due to their ease of use and low cost. Artificial intelligence (AI)-based large language models (LLMs) have made it easy for bad actors to automatically fill in online forms, including generating meaningful text for open-ended tasks. These technological advances threaten the data quality of studies that use online questionnaires. This study tested whether text generated by an AI for the purpose of an online study can be detected by humans and by automatic AI detection systems. While humans were able to correctly identify the authorship of such text above chance level (76% accuracy), their performance was still below what would be required to ensure satisfactory data quality. For now, open-ended responses remain a useful tool for ensuring data quality only as long as bad actors lack the interest to target them. Automatic AI detection systems are currently completely unusable. If AI submissions of responses become too prevalent, then the costs associated with detecting fraudulent submissions will outweigh the benefits of online questionnaires. Individual attention checks will no longer be a sufficient tool to ensure good data quality. This problem can only be systematically addressed by crowdsourcing platforms. These platforms cannot rely on automatic AI detection systems, and it is unclear how they can ensure data quality for their paying clients.
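
To make the phrase "above chance level" concrete, the following is a minimal sketch of a one-sided exact binomial test against a 50% guessing rate. It is an illustration only: the abstract does not report the number of judgments or the statistical test the authors used, so the counts below (76 correct out of 100) and the function name `binomial_p_value` are hypothetical, chosen simply to match the reported 76% accuracy.

```python
from math import comb


def binomial_p_value(correct: int, total: int, chance: float = 0.5) -> float:
    """One-sided exact binomial test.

    Returns P(X >= correct) when each of `total` judgments is
    independently correct with probability `chance`.
    """
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical counts, NOT taken from the study: 76 correct out of 100.
correct, total = 76, 100
accuracy = correct / total
p_value = binomial_p_value(correct, total)

print(f"accuracy = {accuracy:.2f}, one-sided p = {p_value:.2e}")
```

A small p-value here only indicates that the judges do better than guessing. It says nothing about whether the accuracy is adequate in practice: at 76% accuracy, roughly one judgment in four is still wrong, which is the gap the abstract points to.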

Keywords: AI; data quality; detection; imitation game; large language models; online questionnaires; reliability.

Grants and funding

The authors declare that no financial support was received for the research, authorship, and/or publication of this article.