The spatial arrangement method of measuring similarity can capture high-dimensional semantic structures

Behav Res Methods. 2020 Oct;52(5):1906-1928. doi: 10.3758/s13428-020-01362-y.

Abstract

Psychologists collect similarity data to study a variety of phenomena including categorization, generalization and discrimination, and representation itself. However, collecting similarity judgments between all pairs of items in a set is expensive, spurring development of techniques like the Spatial Arrangement Method (SpAM; Goldstone, Behavior Research Methods, Instruments, & Computers, 26, 381-386, 1994), wherein participants place items on a two-dimensional plane such that proximity reflects perceived similarity. While SpAM greatly hastens similarity measurement, and has been successfully used for lower-dimensional, perceptual stimuli, its suitability for higher-dimensional, conceptual stimuli is less understood. In study 1, we evaluated the ability of SpAM to capture the semantic structure of eight different categories composed of 20-30 words each. First, SpAM distances correlated strongly (r = .71) with pairwise similarity judgments, although below SpAM and pairwise judgment split-half reliabilities (r's > .9). Second, a cross-validation exercise with multidimensional scaling fits at increasing latent dimensionalities suggested that aggregated SpAM data favored higher (> 2) dimensional solutions for seven of the eight categories explored here. Third, split-half reliability of SpAM dissimilarities was high (Pearson r = .90), while the average correlation between pairs of participants was low (r = .15), suggesting that when different participants focus on different pairs of stimulus dimensions, reliable high-dimensional aggregate similarity data is recoverable. In study 2, we show that SpAM can recover the Big Five factor space of personality trait adjectives, and that cross-validation favors a four- or five-dimension solution on this dataset. We conclude that SpAM is an accurate and reliable method of measuring similarity for high-dimensional items like words. We publicly release our data for researchers.

Keywords: Big Five; concepts; multidimensional scaling; similarity; spatial cognition; traits.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Humans
  • Judgment*
  • Reproducibility of Results
  • Research Design
  • Semantics*