Resource type:



The Drug Opinions Spanish (DOS) corpus was sourced from the web portal, which is an independent platform for sharing experiences with drugs. It comprises 877 opinions about the 30 most reviewed drugs as of March 14, 2017. Each review records the date on which it was posted, the gender and age of the consumer, the disease and the drug used to treat it, the textual opinion, and a rating for each of the following satisfaction categories: overall, efficacy, side effects quantity, side effects severity and ease of use. Moreover, each review was manually annotated at the aspect level with the side effects it describes, and each side effect was given an opinion polarity label and an opinion intensity label according to the patient's experience. The corpus has 3,784 sentences containing a total of 2,230 side effects, of which 98 are positive, 2,119 negative and 13 neutral. Regarding intensity, 655 side effects are of high intensity, 1,486 of medium intensity and 89 of low intensity.
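The review structure and annotation counts described above can be sketched in Python as follows. This is only an illustration: the class names, field names and field types (`Review`, `SideEffect`, `age` as a string, ratings as integers) are assumptions made for the example and do not reflect the actual file format of the corpus. Only the counts are taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SideEffect:
    """One aspect-level annotation (hypothetical layout)."""
    text: str
    polarity: str   # "positive", "negative" or "neutral"
    intensity: str  # "high", "medium" or "low"


@dataclass
class Review:
    """One review with its metadata and ratings (hypothetical layout)."""
    date: str
    gender: str
    age: str        # assumed to be stored as text; the real format may differ
    disease: str
    drug: str
    opinion: str
    overall: int
    efficacy: int
    side_effects_quantity: int
    side_effects_severity: int
    ease_of_use: int
    side_effects: List[SideEffect] = field(default_factory=list)


# Counts as reported in the corpus description.
TOTAL_SIDE_EFFECTS = 2230
POLARITY_COUNTS = {"positive": 98, "negative": 2119, "neutral": 13}
INTENSITY_COUNTS = {"high": 655, "medium": 1486, "low": 89}


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each label's share of the total, as a percentage rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    # Both breakdowns should add up to the reported total of side effects.
    print(sum(POLARITY_COUNTS.values()) == TOTAL_SIDE_EFFECTS)
    print(sum(INTENSITY_COUNTS.values()) == TOTAL_SIDE_EFFECTS)
    print(shares(POLARITY_COUNTS))
    print(shares(INTENSITY_COUNTS))
```

The percentages make the class imbalance explicit: negative side effects account for the vast majority of annotations, which is worth keeping in mind when the corpus is used to train or evaluate polarity classifiers.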

How to cite:

Jiménez-Zafra, S. M., Martín-Valdivia, M. T., Molina-González, M. D. & Ureña-López, L. A. (2017). Corpus Annotation for Aspect Based Sentiment Analysis in Medical Domain. Proceedings of the 2nd International Workshop on Extraction and Processing of Rich Semantics from Medical Texts.

Files of the resource:

For any questions related to the corpus, please send an email to Salud María Jiménez-Zafra or M. Teresa Martín-Valdivia.