Abstract
Psychologists collect similarity data to study a variety of phenomena including categorization, generalization and discrimination, and representation itself. However, collecting similarity judgments between all pairs of items in a set is expensive, spurring development of techniques like the Spatial Arrangement Method (SpAM; Goldstone, Behavior Research Methods, Instruments, & Computers, 26, 381–386, 1994), wherein participants place items on a two-dimensional plane such that proximity reflects perceived similarity. While SpAM greatly hastens similarity measurement and has been used successfully for lower-dimensional, perceptual stimuli, its suitability for higher-dimensional, conceptual stimuli is less well understood. In Study 1, we evaluated the ability of SpAM to capture the semantic structure of eight different categories composed of 20–30 words each. First, SpAM distances correlated strongly (r = .71) with pairwise similarity judgments, although this correlation fell below the split-half reliabilities of both SpAM and pairwise judgments (rs > .9). Second, a cross-validation exercise with multidimensional scaling fits at increasing latent dimensionalities suggested that aggregated SpAM data favored higher-dimensional (> 2) solutions for seven of the eight categories explored here. Third, split-half reliability of SpAM dissimilarities was high (Pearson r = .90), while the average correlation between pairs of participants was low (r = .15), suggesting that reliable high-dimensional similarity data can be recovered in aggregate even when different participants focus on different pairs of stimulus dimensions. In Study 2, we show that SpAM can recover the Big Five factor space of personality trait adjectives, and that cross-validation favors a four- or five-dimensional solution on this dataset. We conclude that SpAM is an accurate and reliable method of measuring similarity for high-dimensional items like words. We publicly release our data for researchers.
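To make the reliability measure concrete, the following is a minimal Python sketch of how SpAM dissimilarities and a split-half correlation could be computed from participants' two-dimensional arrangements. It is an illustration under stated assumptions, not the analysis code used in the studies: the input format (one list of (x, y) coordinates per participant), the function names, and the single random split are all hypothetical, and no correction such as Spearman-Brown is applied.

```python
import math
import random
from itertools import combinations


def spam_distances(positions):
    """Euclidean distance for every item pair in one participant's arrangement.

    positions: list of (x, y) tuples, one per item (assumed input format).
    """
    return [math.dist(positions[i], positions[j])
            for i, j in combinations(range(len(positions)), 2)]


def mean_distances(arrangements):
    """Average the pairwise distance vectors across a group of participants."""
    vectors = [spam_distances(p) for p in arrangements]
    return [sum(column) / len(column) for column in zip(*vectors)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_half_reliability(arrangements, seed=0):
    """Correlate the aggregate distances of two random halves of participants."""
    shuffled = list(arrangements)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return pearson_r(mean_distances(shuffled[:half]),
                     mean_distances(shuffled[half:]))


# Usage with synthetic placements (random data, for illustration only):
# 10 hypothetical participants each arrange 6 items on a unit square.
rng = random.Random(1)
synthetic = [[(rng.random(), rng.random()) for _ in range(6)]
             for _ in range(10)]
reliability = split_half_reliability(synthetic)
```

Applying `pearson_r` to the distance vectors of two individual participants gives the between-participant correlation, so the same helpers cover both quantities contrasted in the third finding.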