SHARE: A Lexicon of Harmful Expressions by the Spanish Population

Resource type
Lexicon
Description

SHARE is a new lexical resource with 10,125 offensive terms and expressions collected from the Spanish population. We retrieved this terminology using Fiero, an existing Telegram chatbot developed to engage users in conversation and collect insults. The resource comprises 5,888 offensive unigrams, 2,447 bigrams, and 1,790 trigrams. The vocabulary was manually labeled by five annotators, who obtained a kappa agreement coefficient of 78.8%.
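The unigram, bigram, and trigram split described above can be checked on a local copy of the lexicon with a short script. The following is a minimal sketch, not part of the resource: it assumes the entries have already been loaded into a list of strings (the distribution file format is not stated here), and the function names and the neutral placeholder entries are hypothetical.

```python
from collections import Counter


def ngram_size(entry: str) -> int:
    """Return the number of whitespace-separated tokens in a lexicon entry."""
    return len(entry.split())


def count_by_size(entries):
    """Count entries by token length (1 = unigram, 2 = bigram, 3 = trigram).

    Blank entries are skipped.
    """
    return Counter(ngram_size(e) for e in entries if e.strip())


# Hypothetical placeholder entries; the real terms come from the lexicon file.
sample = ["termino", "dos palabras", "una frase corta", "otro"]
counts = count_by_size(sample)
print(counts[1], counts[2], counts[3])
```

Run on the full lexicon, the three counts would be expected to match the figures reported in the description (5,888, 2,447, and 1,790, which sum to 10,125), provided entries are tokenized on whitespace.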

TERMS OF USE:

  • The resource is freely available for research purposes.
  • Do not redistribute the data.
  • SINAI disclaims any responsibility for the use of the lexicon and does not provide technical support. However, the following contacts will be happy to respond to queries and clarifications: fmplaza@ujaen.es, maite@ujaen.es