Conceptualizing syntactic categories as semantic categories: Unifying part-of-speech identification and semantics using co-occurrence vector averaging

Chris Westbury; Geoff Hollis

doi:10.3758/s13428-018-1118-4

Behav Res Methods. 2019 Jun;51(3):1371-1398. doi: 10.3758/s13428-018-1118-4.

Authors

Chris Westbury¹, Geoff Hollis²

Affiliations

¹ Department of Psychology, University of Alberta, P220 Biological Sciences Building, Edmonton, Alberta, T6G 2E9, Canada. chrisw@ualberta.ca.
² Department of Psychology, University of Alberta, P220 Biological Sciences Building, Edmonton, Alberta, T6G 2E9, Canada.

PMID: 30215164
DOI: 10.3758/s13428-018-1118-4

Abstract

Co-occurrence models have been of considerable interest to psychologists because they are built on very simple functionality. This is particularly clear in the case of prediction models, such as the continuous skip-gram model introduced in Mikolov, Chen, Corrado, and Dean (2013), because these models depend on functionality closely related to the simple Rescorla-Wagner model of discriminant learning in nonhuman animals (Rescorla & Wagner, 1972), which has a rich history within psychology as a model of many animal learning processes. We replicate and extend earlier work showing that it is possible to extract accurate information about syntactic category and morphological family membership directly from patterns of word co-occurrence, and provide evidence from four experiments showing that this information predicts human reaction times and accuracy for class membership decisions.
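
The vector-averaging idea named in the title can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors below are invented toy values (real skip-gram vectors have hundreds of dimensions and are learned from a corpus), and the helper names `mean_vector`, `cosine`, and `classify` are hypothetical. The sketch assumes a category is represented by the average of the vectors of a few known members, and that a word is assigned to the category whose average vector it is most similar to by cosine.

```python
import math

def mean_vector(vectors):
    """Component-wise average of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented toy vectors for illustration only (assumption, not data from the paper).
toy_vectors = {
    "dog":   [0.90, 0.10, 0.00],
    "cat":   [0.80, 0.20, 0.10],
    "run":   [0.10, 0.90, 0.20],
    "eat":   [0.20, 0.80, 0.10],
    "table": [0.85, 0.15, 0.05],
}

# One averaged vector per category, built from a few known exemplars.
category_vectors = {
    "noun": mean_vector([toy_vectors["dog"], toy_vectors["cat"]]),
    "verb": mean_vector([toy_vectors["run"], toy_vectors["eat"]]),
}

def classify(word):
    """Return the category whose averaged vector is closest to the word's vector."""
    vector = toy_vectors[word]
    return max(category_vectors, key=lambda name: cosine(vector, category_vectors[name]))

print(classify("table"))
```

In this toy setup, similarity to the averaged category vector doubles as a graded measure of category membership, which is the kind of quantity that could be compared against reaction times and accuracy in class membership decisions.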

Keywords: Co-occurrence models; Morphology; Part-of-speech tagging; Semantics; Word2vec.

MeSH terms

Concept Formation*
Decision Making
Humans
Learning
Reaction Time
Semantics
Speech