Multimodal Information Retrieval based on knowledge integration

Manuel Carlos Díaz Galiano. April 2011

This study aims to integrate knowledge and filtering techniques in order to improve Multimodal Information Retrieval systems. Traditional Information Retrieval (IR) systems are primarily concerned with textual information. However, the electronic information available today is not only textual but multimodal. By multimodality we mean any format, including text, images, video or audio; in most cases these formats appear mixed together.

There are specialized systems that extract information from different formats. Examples include Content-Based Image Retrieval (CBIR) systems, systems that extract video features and systems that transcribe conversations to text. In most of these, the information obtained is ultimately expressed as text, so traditional text processing techniques are often applied to it. A multimodal system is a system that retrieves information from large collections in various formats.

A multimodal system can exploit the advantages of the various specialized systems. Multimodality allows, for example, CBIR systems to improve by using the textual information that appears next to images. These systems are useful for professionals who need to work with formats other than text. One such area is medical work, which generates large volumes of information on each clinical case, including text and images from the various tests.

This study examines how a multimodal system is affected by filtering the available textual information and enriching it with specific knowledge. For this purpose, multimodal corpora are used, which are available in the various evaluation forums for these systems. We focus on the corpus provided by ImageCLEFmed, as it concerns a more specific domain, namely healthcare.

(Link TESEO)