OffendES_spans: A Corpus in Spanish for Offensive Span Identification


OffendES_spans is a Spanish corpus created in the spirit of the original OffendES dataset, but with offensive spans labeled automatically using SHARE, a lexicon of offensive terms and expressions. The corpus consists of 11,035 comments annotated with offensive spans.
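
As a rough illustration of what lexicon-based span labeling can look like, here is a minimal Python sketch. It is not the procedure used to build this corpus, which is not detailed here. The function name find_offensive_spans and the two-entry toy lexicon are hypothetical and are not taken from SHARE. The sketch assumes spans are character offsets and that longer expressions take precedence over the shorter terms they contain.

```python
import re


def find_offensive_spans(text, lexicon):
    """Return non-overlapping (start, end) character spans of lexicon matches."""
    # Put longer entries first so multi-word expressions win over their sub-terms.
    terms = sorted((t for t in lexicon if t), key=len, reverse=True)
    if not terms:
        return []
    alternation = "|".join(re.escape(t) for t in terms)
    # Lookarounds keep matches on word boundaries, so a term inside a longer
    # word is not labeled.
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)
    return [(m.start(), m.end()) for m in pattern.finditer(text)]


# Hypothetical toy lexicon, for illustration only (not SHARE entries).
toy_lexicon = {"idiota", "pedazo de idiota"}
comment = "Eres un pedazo de idiota, idiota."

spans = find_offensive_spans(comment, toy_lexicon)
matched_texts = [comment[start:end] for start, end in spans]
print(spans)
print(matched_texts)
```

Each span is an end-exclusive pair of character offsets, so comment[start:end] recovers the matched expression. Exact string matching of this kind misses inflected or misspelled variants, so real annotation would likely need normalization beyond what is shown.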


  • The resource is available free for research purposes.
  • Do not redistribute the data.
  • SINAI disclaims any responsibility for the use of the lexicon and does not provide technical support. However, the following contacts will be happy to respond to queries and clarifications:


How to cite

If you use this resource, please cite the following paper:

    title = "{O}ffend{ES}: A New Corpus in {S}panish for Offensive Language Research",
    author = "Plaza-del-Arco, Flor Miriam  and
      Montejo-R{\'a}ez, Arturo  and
      Ure{\~n}a-L{\'o}pez, L. Alfonso  and
      Mart{\'\i}n-Valdivia, Mar{\'\i}a-Teresa",
    booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)",
    month = sep,
    year = "2021",
    address = "Held Online",
    publisher = "INCOMA Ltd.",
    url = "",
    pages = "1096--1108",