Research lines

Entity Recognition

A Named Entity Recognition (NER) system tries to find, within a text or document, the short phrases that directly answer simple questions (who?, how?, where? …).

Monolingual, Multilingual and Distributed Information Retrieval

Information retrieval (IR) systems are responsible for selecting and retrieving the documents that are relevant to a user's information need. These systems return a list of relevant documents, usually ordered by a score that measures how well each document answers that need.
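As a rough illustration of such a ranked list, the following sketch scores a toy collection against a query using TF-IDF weights and cosine similarity. The sample documents, the whitespace tokenizer and the function names are assumptions made for this example; it is not a description of any particular system.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()

def rank(query, documents):
    """Return (doc_index, score) pairs sorted by decreasing cosine similarity."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in doc_tokens for term in set(tokens))

    def vectorize(tokens):
        tf = Counter(tokens)
        # Smoothed IDF keeps weights positive even for very common terms.
        return {t: tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t in tf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = vectorize(tokenize(query))
    scores = [(i, cosine(q_vec, vectorize(tokens)))
              for i, tokens in enumerate(doc_tokens)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical toy collection.
    docs = [
        "multilingual information retrieval systems",
        "opinion mining of product reviews",
        "question answering over documents",
    ]
    for index, score in rank("information retrieval", docs):
        print(f"{score:.3f}  {docs[index]}")
```

Documents that share no term with the query receive a score of zero and fall to the end of the list.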

Multimodal Information Retrieval

Currently, there is a huge amount of unstructured information available online, both on the public web and in the “hidden” web (intranets, digital libraries, etc.). This information can be visual as well as textual, and is found in all kinds of multimedia documents (video, images, audio, conference transcripts …). Retrieving information from such varied collections raises challenges such as indexing and merging.

Opinion Mining

Opinion Mining aims to bring the principles of data mining (discovery of relationships, classes, etc.) to the analysis of product reviews and of reviews in blogs and other collaborative environments. It attempts to determine the polarity of the opinion expressed by the author of a comment in order to extract an overall assessment from it. This discipline is of considerable interest for e-commerce systems, but its scope is much broader.

Question Answering

A Question Answering (QA) system can be defined as a system that automatically finds concrete answers to user queries. These systems are very useful when the user needs a specific piece of data and does not want to review all the documentation on the topic to find it.

Recommender Systems

Recommender systems are aimed at consumers, suggesting products that may interest them. In our group we work to improve current collaborative filtering systems by adding analysis of the components of human language.
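For readers unfamiliar with collaborative filtering, the sketch below shows a plain user-based baseline: a missing rating is predicted as the similarity-weighted average of the ratings given by other users. It covers only the baseline, not the language analysis mentioned above, and the users, items, ratings and function names are all hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity computed over the items both users have rated.
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    if not norm_u or not norm_v:
        return 0.0
    return dot / (norm_u * norm_v)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with a non-zero similarity rated it.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None

if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale.
    ratings = {
        "ana": {"book": 5, "film": 1},
        "luis": {"book": 5, "film": 1, "album": 4},
        "eva": {"book": 1, "film": 5, "album": 2},
    }
    print(predict(ratings, "ana", "album"))
```

Because "ana" rates items much more like "luis" than like "eva", the prediction lands closer to the rating "luis" gave.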

Text Categorization

Automated Text Categorization (ATC) involves the automatic classification of documents into predefined categories.

Word Sense Disambiguation

Word Sense Disambiguation (WSD) is the identification of the meaning of a word in a given context, chosen from a given set of candidate senses. Disambiguation is not an end in itself, but it is a necessary intermediate step for some Natural Language Processing (NLP) tasks.
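One classic way to choose among candidate senses is a simplified Lesk-style overlap: pick the sense whose dictionary gloss shares the most words with the context. The sketch below assumes a tiny hand-written sense inventory and stopword list; the sense labels, glosses and function name are hypothetical, and this is only one of many possible approaches.

```python
def simplified_lesk(word, context, sense_glosses, stopwords=frozenset()):
    """Return the candidate sense whose gloss overlaps most with the context."""
    context_words = {
        w for w in context.lower().split()
        if w not in stopwords and w != word
    }
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        gloss_words = set(gloss.lower().split()) - stopwords
        overlap = len(context_words & gloss_words)
        # Ties keep the first sense seen, i.e. the inventory order.
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

if __name__ == "__main__":
    # Hypothetical sense inventory and stopword list for the word "bank".
    senses = {
        "bank/finance": "institution that accepts deposits and lends money",
        "bank/river": "sloping land beside a river or lake",
    }
    stop = frozenset({"a", "the", "of", "and", "or", "on", "that", "she", "he", "in"})
    print(simplified_lesk("bank", "she sat on the bank of the river", senses, stop))
    print(simplified_lesk("bank", "he deposits money in the bank", senses, stop))
```

Removing stopwords matters here: without it, function words such as "and" would count as overlap and could tip the choice toward the wrong sense.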