SciELO - Scientific Electronic Library Online




Revista signos

Online version ISSN 0718-0934


PEREZ-TELLEZ, Fernando; CARDIFF, John; ROSSO, Paolo and PINTO, David. Disambiguating company names in microblog text using clustering for online reputation management. Rev. signos [online]. 2015, vol.48, n.87, pp.54-77. ISSN 0718-0934.

Twitter is used by millions of users to publish brief messages (tweets) that share experiences and opinions about a product or service. There is a clear need for systems that can mine these messages to derive information about the collective thinking of Twitter users (e.g. for opinion or sentiment analysis). Tweet analysis is an important task because comments, opinions, suggestions, and complaints can inform marketing strategies or help assess a company’s reputation. For this purpose, it is necessary to establish automatically whether a tweet refers to a company when the company name is ambiguous. This is not a straightforward keyword search, as a name can be used in multiple contexts. The aim of this study is to present and compare four approaches that improve the representation of short texts so that the clustering task, which determines whether a given tweet refers to a particular company, performs better. To this end, we used a variety of enriching methodologies based on term expansion via the semantic similarity hidden behind the lexical structure, in order to improve the representation of tweets and, as a consequence, the performance of the task. We used two tweet datasets of company names with different levels of ambiguity. The results are promising, although they highlight the difficulty of the task.
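
To make the idea of term expansion concrete, the following is a minimal sketch, not the method evaluated in the article. It assumes a simple co-occurrence count as a stand-in for the semantic similarity mentioned above. The toy tweets, the function names (`cooccurrence`, `expand`, `cosine`), and the `min_count` and `skip` parameters are all hypothetical and chosen only for illustration. The ambiguous name ("apple") is skipped during expansion because it appears in every tweet and so carries no discriminating information.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def cooccurrence(docs):
    """Count how often each pair of distinct terms shares a document."""
    counts = defaultdict(Counter)
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def expand(tokens, counts, min_count=2, skip=frozenset()):
    """Append terms that co-occur at least min_count times with a token.

    Terms in `skip` are neither expanded nor added (illustrative assumption).
    """
    extra = set()
    for term in set(tokens) - skip:
        for other, n in counts.get(term, {}).items():
            if n >= min_count and other not in skip:
                extra.add(other)
    return list(tokens) + sorted(extra - set(tokens))


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


# Invented toy corpus: "apple" as a company (0-3) and as a fruit (4-6).
tweets = [
    ["apple", "iphone", "launch"],
    ["apple", "iphone", "launch", "event"],
    ["apple", "iphone", "price"],
    ["apple", "launch", "today"],
    ["apple", "pie", "recipe"],
    ["apple", "pie", "recipe", "dessert"],
    ["apple", "pie", "baking"],
]

counts = cooccurrence(tweets)
expanded = [expand(t, counts, min_count=2, skip=frozenset({"apple"})) for t in tweets]

# Before expansion, tweet 2 shares only "apple" with tweets 3 and 6.
raw_same = cosine(tweets[2], tweets[3])
raw_other = cosine(tweets[2], tweets[6])

# After expansion, tweet 2 moves closer to the other company tweet.
exp_same = cosine(expanded[2], expanded[3])
exp_other = cosine(expanded[2], expanded[6])

print(raw_same, raw_other)
print(exp_same, exp_other)
```

In this toy setting the raw vectors cannot separate the two senses, while the expanded vectors can. Any clustering algorithm that works on such vectors could then be applied to the expanded representation.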

Keywords: Clustering of tweets; opinion analysis; disambiguation; online reputation management.



Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.