Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT

Nat Comput Sci. 2023 Oct;3(10):833-838. doi: 10.1038/s43588-023-00527-x. Epub 2023 Oct 5.

Abstract

We design a battery of semantic illusions and cognitive reflection tests aimed at eliciting intuitive yet erroneous responses. We administer these tasks, traditionally used to study reasoning and decision-making in humans, to OpenAI's generative pre-trained transformer model family. The results show that as the models expand in size and linguistic proficiency, they increasingly display human-like intuitive system 1 thinking and the cognitive errors associated with it. This pattern shifts notably with the introduction of ChatGPT models, which tend to respond correctly, avoiding the traps embedded in the tasks. Both ChatGPT-3.5 and 4 use the input-output context window to engage in chain-of-thought reasoning, reminiscent of how people use notepads to support their system 2 thinking. Yet they remain accurate even when prevented from engaging in chain-of-thought reasoning, indicating that their system-1-like next-word generation processes are more accurate than those of older models. Our findings highlight the value of applying psychological methodologies to study large language models, as this can uncover previously undetected emergent characteristics.
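To illustrate the kind of task and manipulation described in the abstract, the sketch below administers a classic cognitive reflection item (the bat-and-ball problem, where the intuitive answer is $0.10 and the correct answer is $0.05) to an OpenAI chat model, once with room for chain-of-thought reasoning and once with the instruction to give the final answer only. This is a minimal sketch under stated assumptions, not the authors' protocol: the use of the `openai` Python client, the model name, and the exact prompt wording are illustrative choices, not taken from the paper.

```python
# Minimal sketch: presenting a cognitive reflection test (CRT) item to an
# OpenAI chat model, with and without chain-of-thought reasoning allowed.
# Assumptions (not from the paper): openai Python package v1+, model name
# "gpt-4", and the specific system instructions used to permit or suppress
# intermediate reasoning.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CRT_ITEM = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than "
    "the ball. How much does the ball cost?"
)

def ask(question: str, allow_reasoning: bool) -> str:
    """Query the model once; optionally forbid intermediate reasoning steps."""
    system = (
        "Think step by step, then state your final answer."
        if allow_reasoning
        else "Answer with the final number only. Do not show any reasoning."
    )
    response = client.chat.completions.create(
        model="gpt-4",   # placeholder; swap in any chat model of interest
        temperature=0,   # low-variance sampling for comparability across runs
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print("With chain-of-thought:   ", ask(CRT_ITEM, allow_reasoning=True))
    print("Without chain-of-thought:", ask(CRT_ITEM, allow_reasoning=False))
```

Comparing the two conditions across model generations is one simple way to separate accuracy gains that depend on written-out reasoning from gains in the underlying next-word generation process itself.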

MeSH terms

  • Bias
  • Humans
  • Intuition*
  • Language
  • Linguistics
  • Problem Solving*