SciELO - Scientific Electronic Library Online








Revista signos

On-line version ISSN 0718-0934


OLMOS, Ricardo; LEON, José Antonio; ESCUDERO, Inmaculada and JORGE-BOTANA, Guillermo. An analysis of size and specificity of corpora in the assessment of summaries using LSA: A comparative study between LSA and human raters. Rev. signos [online]. 2009, vol.42, n.69, pp.71-81. ISSN 0718-0934.

Latent Semantic Analysis (LSA) is an automatic statistical method for representing the meanings of words and text passages. A growing body of evidence supports the reliability of LSA as a tool for assessing semantic similarity between units of discourse, and its similarity estimates for documents have proved comparable to human judgments. The tool derives its mathematical representation of texts by first analyzing a linguistic corpus composed of digitized documents. The main objective of this study was to determine what properties (general, condensed, diversified, and base corpus) a linguistic corpus should have so that the assessment of summaries carried out by LSA is as similar as possible to the assessment made by four human raters. Three hundred and ninety Spanish middle and high school students (14-16 years old) and undergraduate students read a narrative text and later summarized it. Findings indicate that, for the assessment to be satisfactorily efficient, the corpora need not be as general or as large as those used in Boulder (made up of millions of texts and over one million words), nor should they be too specific (fewer than 300 texts and 5000 words).
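
The comparison between LSA and human raters described above can be illustrated with a minimal sketch. It assumes that each text has already been projected into an LSA space, that LSA scores a summary by its cosine similarity to the source text, and that agreement with the raters is measured by Pearson correlation; the abstract does not state which similarity or agreement measures the authors used. All vectors and ratings below are hypothetical values invented for illustration, not data from the study.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector is treated as having no similarity
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical 3-dimensional LSA vectors (real spaces use far more dimensions).
source_vec = [0.8, 0.1, 0.3]
summary_vecs = [
    [0.7, 0.2, 0.3],  # summary close to the source text
    [0.1, 0.9, 0.2],  # summary that drifts off topic
    [0.5, 0.4, 0.4],  # partially relevant summary
]

# Hypothetical scores from four human raters for the same three summaries.
human_ratings = [
    [8, 9, 8, 9],
    [2, 3, 2, 1],
    [6, 5, 6, 6],
]

# LSA score: similarity of each summary to the source text.
lsa_scores = [cosine(source_vec, v) for v in summary_vecs]

# Human score: mean of the four raters for each summary.
human_means = [sum(r) / len(r) for r in human_ratings]

# Agreement between LSA and the human raters.
agreement = pearson(lsa_scores, human_means)
print(f"LSA scores: {[round(s, 3) for s in lsa_scores]}")
print(f"Human means: {human_means}")
print(f"Pearson r: {agreement:.3f}")
```

Under these assumptions, repeating the calculation with LSA spaces built from different corpora would show which corpus yields scores closest to the human assessment.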

Keywords: Latent Semantic Analysis (LSA); summary; discourse assessment; linguistic corpus; university students.



Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.