Corpus of Spanish tweets for sentiment analysis. The corpus is composed of 34634 tweets tagged with noisy labels. 17317 of the tweets are positive and 17317 are negative, so the corpus is balanced.

How to cite:

Martínez-Cámara, E., Martín-Valdivia, M. T., Ureña-López, L. A., Mitkov, R. (2015). Polarity classification for Spanish tweets using the COST corpus. Journal of Information Science, 41(3), 263-272. DOI: 10.1177/0165551514566564.

Resource files:

To obtain the corpus, write an email to Eugenio Martínez Cámara (


