Predicting raters' transparency judgments of English and Chinese morphological constituents using latent semantic analysis

Behav Res Methods. 2014 Mar;46(1):284-306. doi: 10.3758/s13428-013-0360-z.

Abstract

The morphological constituents of English compounds (e.g., "butter" and "fly" for "butterfly") and two-character Chinese compounds may differ in meaning from the whole word. Subjective differences among raters and the ambiguity of transparency itself make such judgments difficult, and a computational alternative based on a general model might offer a way to average across those differences. In the present study, we propose two approaches based on latent semantic analysis (Landauer & Dumais in Psychological Review 104:211-240, 1997): Model 1 compares the semantic similarity between a compound word and each of its constituents, and Model 2 derives the dominant meaning of a constituent from a clustering analysis of its morphological family members (e.g., "butterfingers" or "buttermilk" for "butter"). The proposed models successfully predicted participants' transparency ratings, and we recommend that experimenters use Model 1 for English compounds and Model 2 for Chinese compounds, on the basis of differences in raters' morphological processing in the two writing systems. Measures of the dominance of lexical meaning, semantic transparency, and the average similarity between all pairs within a morphological family are provided, and practical applications for future studies are discussed.
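
To make the Model 1 comparison concrete, the following is a minimal sketch of how a compound could be compared with each of its constituents in a latent semantic space, along with the average pairwise similarity within a morphological family. It is not the authors' implementation: the toy term-by-document counts, the choice of three dimensions, and the helper names (`lsa_term_vectors`, `constituent_similarities`, `family_mean_similarity`) are all assumptions made for illustration, and the clustering step of Model 2 is not shown.

```python
import itertools

import numpy as np


def lsa_term_vectors(term_doc, k):
    """Return k-dimensional term vectors from a truncated SVD of a
    term-by-document count matrix (rows are terms)."""
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy corpus: rows are words, columns are six invented documents
# (two about insects, two about dairy, one about clumsiness, one mixed).
vocab = ["butterfly", "butter", "fly", "buttermilk", "butterfingers"]
counts = np.array(
    [
        [3, 2, 0, 0, 0, 1],  # butterfly
        [0, 0, 3, 2, 0, 1],  # butter
        [2, 3, 0, 0, 1, 0],  # fly
        [0, 0, 2, 3, 0, 0],  # buttermilk
        [0, 0, 0, 0, 3, 1],  # butterfingers
    ],
    dtype=float,
)

# k=3 is an arbitrary choice for this toy matrix, not a value from the study.
vectors = dict(zip(vocab, lsa_term_vectors(counts, k=3)))


def constituent_similarities(compound, constituents):
    """Model 1 style comparison: similarity of a compound to each constituent."""
    return {c: cosine(vectors[compound], vectors[c]) for c in constituents}


def family_mean_similarity(members):
    """Average cosine similarity over all pairs in a morphological family."""
    pairs = itertools.combinations(members, 2)
    return float(np.mean([cosine(vectors[a], vectors[b]) for a, b in pairs]))


model1 = constituent_similarities("butterfly", ["butter", "fly"])
family = family_mean_similarity(["butterfly", "buttermilk", "butterfingers"])
print(model1)
print(family)
```

In this toy space a higher cosine for one constituent would be read as that constituent being more transparent with respect to the compound; with real data the vectors would come from an LSA space trained on a large corpus rather than from hand-written counts.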

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Area Under Curve
  • Asian People
  • Female
  • Humans
  • Judgment*
  • Language*
  • Models, Psychological*
  • Models, Statistical*
  • Predictive Value of Tests
  • Psycholinguistics / methods*
  • Psycholinguistics / statistics & numerical data
  • ROC Curve
  • Semantics*
  • Vocabulary