New algorithms assessing short summaries in expository texts using latent semantic analysis

Behav Res Methods. 2009 Aug;41(3):944-50. doi: 10.3758/BRM.41.3.944.

Abstract

In this study, we compared four expert graders with latent semantic analysis (LSA) in assessing short summaries of an expository text. As is well known, LSA has technical difficulties in establishing a good semantic representation when it analyzes short texts. In order to improve the reliability of LSA relative to human graders, we analyzed three new algorithms using two holistic methods from previous research (León, Olmos, Escudero, Cañas, & Salmerón, 2006). The three new algorithms were (1) the semantic common network algorithm, an adaptation of an algorithm proposed by W. Kintsch (2001, 2002) that treats LSA as a dynamic model of semantic representation; (2) a best-dimension reduction measure of the latent semantic space, which selects the dimensions that contribute most to improving the LSA assessment of summaries (Hu, Cai, Wiemer-Hastings, Graesser, & McNamara, 2007); and (3) the Euclidean distance measure, used by Rehder et al. (1998), which incorporates both vector length and the cosine measure. A total of 192 Spanish middle-grade students and 6 experts took part in the study. They read an expository text and produced a short summary. Results showed that LSA was significantly more reliable as a computerized assessment tool for expository text when it used a best-dimension algorithm rather than the standard LSA algorithm. The semantic common network algorithm also showed promising results.
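The standard LSA pipeline the abstract builds on can be sketched briefly. This is an illustrative toy example, not the authors' implementation: it builds a small term-document matrix (counts invented for illustration), projects documents into a truncated latent space via SVD (the space whose dimensions the best-dimension algorithm would select among), and scores document pairs with the cosine measure and the Euclidean distance measure mentioned in the abstract.

```python
import numpy as np

def truncated_svd_docs(tdm, k):
    """Project the documents of a term-document matrix into a
    k-dimensional latent semantic space via truncated SVD."""
    u, s, vt = np.linalg.svd(tdm, full_matrices=False)
    # Document vectors in the latent space: rows of V_k scaled by
    # the corresponding singular values.
    return vt[:k].T * s[:k]

def cosine(a, b):
    """Standard LSA similarity: cosine of the angle between vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean(a, b):
    """Euclidean distance is sensitive to vector length as well as
    angle, which is why it complements the cosine measure."""
    return float(np.linalg.norm(a - b))

# Toy term-document matrix: 5 terms x 4 documents (invented counts).
tdm = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 1, 1],
    [0, 0, 2, 1],
    [1, 0, 0, 2],
], dtype=float)

docs = truncated_svd_docs(tdm, k=2)   # one 2-D vector per document
sim = cosine(docs[0], docs[2])        # angle-based similarity
dist = euclidean(docs[0], docs[2])    # length-sensitive distance
print(round(sim, 3), round(dist, 3))
```

In a real grading setting, a student summary would be folded into the same latent space and compared against an expert summary or the source text; the choice of k (and, in the best-dimension variant, which dimensions to keep) is what the abstract reports as decisive for reliability.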

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Algorithms*
  • Behavioral Research / methods*
  • Comprehension
  • Humans
  • Models, Statistical
  • Reading
  • Semantics*