Bridging the theoretical gap between semantic representation models without the pressure of a ranking: some lessons learnt from LSA

Guillermo Jorge-Botana; Ricardo Olmos; José María Luzón

doi:10.1007/s10339-019-00934-x

Cogn Process. 2020 Feb;21(1):1-21. doi: 10.1007/s10339-019-00934-x. Epub 2019 Sep 25.

Authors

Guillermo Jorge-Botana¹, Ricardo Olmos², José María Luzón³

Affiliations

¹ Universidad Nacional de Educación a Distancia, Juan del Rosal, nº 10, 28023, Madrid, Spain. gdejorge@psi.uned.es.
² Universidad Autónoma de Madrid, Ciudad Universitaria de Cantoblanco, C/Iván Pavlov, s/n., 28049, Madrid, Spain.
³ Universidad Nacional de Educación a Distancia, Juan del Rosal, nº 10, 28023, Madrid, Spain.

PMID: 31555943
DOI: 10.1007/s10339-019-00934-x

Abstract

In recent years, latent semantic analysis (LSA) has reached a level of maturity at which it is ubiquitous both in technology and in the simulation of cognitive processes. Despite this, there has been a trend of criticizing LSA, usually because it is compared with other models in very specific tasks and conditions, sometimes without a good understanding of what the semantic representation of LSA means, and without exploiting the possibilities that LSA offers beyond the cosine. This paper provides a critical review to clarify some of the misunderstandings regarding LSA and other space models. It reviews the historical stability of the predecessors of LSA; the representational structure of word meaning and the multiple topologies that could arise from a semantic space; the computation of similarity; the myth that LSA dimensions have no meaning; the computational and algorithmic plausibility of LSA as an account of meaning acquisition (in contrast to other models based on online mechanisms); the possibilities of spatial models to substantiate recent proposals; and, in general, the characteristics of classic vector models and their ease and flexibility in simulating some cognitive phenomena. The review highlights the similarity between LSA and other techniques and proposes applying the long experience gained with LSA to other models, especially predicting models such as word2vec. In sum, it emphasizes the lessons that can be learned from comparing LSA-based models with other models, rather than making statements about "the best."

Keywords: Counting models; Distributional models; Latent semantic analysis (LSA); Lexical dynamicity; Predicting models; Spatial models; Topic model; word2vec.

Publication types

Review

MeSH terms

Algorithms
Humans
Knowledge
Learning*
Models, Theoretical
Semantics*