COST

Resource type
Corpora
Description

Corpus of Spanish tweets for sentiment analysis. The corpus consists of 34,634 tweets tagged with noisy labels. Of these, 17,317 are positive and 17,317 are negative, so the corpus is balanced.

To obtain the corpus, send an email to Eugenio Martínez Cámara (emcamara@ujaen.es).

How to cite

Martínez-Cámara, E., Martín-Valdivia, M. T., Ureña-López, L. A., Mitkov, R. (2015). Polarity classification for Spanish tweets using the COST corpus. Journal of Information Science, 41(3), 263-272. DOI: 10.1177/0165551514566564.