Resource type
Corpora
Description
EVOCA (English Version of OCA) is an English corpus produced by translating the Arabic corpus OCA. It contains movie reviews, divided into 250 positive and 250 negative reviews. The corpus was translated in April 2011. Some statistics on it are shown in the following table:
Statistic | Negative | Positive |
---|---|---|
Total documents | 250 | 250 |
Total tokens | 122,135 | 153,581 |
Average tokens per review | 488.54 | 614.32 |
Total sentences | 5,030 | 3,483 |
Average sentences per review | 20.12 | 13.93 |
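
The per-review averages in the table follow from dividing each total by the 250 reviews in its class. The short Python sketch below reproduces that arithmetic from the totals in the table; the variable and function names are illustrative only and are not part of the corpus distribution.

```python
# Totals copied from the statistics table; each polarity class has 250 reviews.
DOCS_PER_CLASS = 250

totals = {
    "negative": {"tokens": 122_135, "sentences": 5_030},
    "positive": {"tokens": 153_581, "sentences": 3_483},
}


def per_review_averages(class_totals, n_docs=DOCS_PER_CLASS):
    """Divide each total by the number of reviews, rounded to two decimals."""
    return {name: round(count / n_docs, 2) for name, count in class_totals.items()}


# Hypothetical helper output: average tokens and sentences per review, by class.
averages = {label: per_review_averages(t) for label, t in totals.items()}

for label, stats in averages.items():
    print(label, stats)
```

Rounded to two decimals, the results match the averages reported in the table.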
How to cite
Rushdi Saleh, M., Martín-Valdivia, M. T., Ureña-López, L. A. & Perea-Ortega, J. M. (2011). Bilingual Experiments with an Arabic-English Corpus for Opinion Mining. Proceedings of Recent Advances in Natural Language Processing, pages 740–745.
For any questions about the corpus, send an email to Mohammed Saleh or José M. Perea.
Resource files