Overlap in meaning is a stronger predictor of semantic activation in GPT-3 than in humans

Jan Digutsch; Michal Kosinski

doi:10.1038/s41598-023-32248-6

Sci Rep. 2023 Mar 28;13(1):5035. doi: 10.1038/s41598-023-32248-6.

Authors

Jan Digutsch¹ ², Michal Kosinski³

Affiliations

¹ Leibniz Research Centre for Working Environment and Human Factors at the Technical University of Dortmund, Dortmund, Germany. jan.digutsch@unisg.ch.
² Institute of Behavioral Science and Technology, University of St. Gallen, St. Gallen, Switzerland. jan.digutsch@unisg.ch.
³ Stanford University, Stanford, CA, 94305, USA.

Abstract

Modern large language models generate texts that are virtually indistinguishable from those written by humans and achieve near-human performance in comprehension and reasoning tests. Yet, their complexity makes it difficult to explain and predict their functioning. We examined a state-of-the-art language model (GPT-3) using lexical decision tasks widely used to study the structure of semantic memory in humans. The results of four analyses showed that GPT-3's patterns of semantic activation are broadly similar to those observed in humans, showing significantly higher semantic activation in related (e.g., "lime-lemon") word pairs than in other-related (e.g., "sour-lemon") or unrelated (e.g., "tourist-lemon") word pairs. However, there are also significant differences between GPT-3 and humans. GPT-3's semantic activation is better predicted by similarity in words' meaning (i.e., semantic similarity) than by their co-occurrence in the language (i.e., associative similarity). This suggests that GPT-3's semantic network is organized around word meaning rather than word co-occurrence in text.

Publication types

Comparative Study

MeSH terms

Comprehension* / physiology
Humans
Natural Language Processing*
Semantics*
Word Association Tests*