Monolingual, Multilingual and Distributed Information Retrieval

Information retrieval (IR) systems select and retrieve the documents that are relevant to a user's information need. They return a list of relevant documents, usually ordered by a score that measures how well each document answers that need.
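The idea of a ranked result list can be illustrated with a minimal sketch. The scoring function below (a count of query-term occurrences) is an assumption chosen only for brevity and is not a method described in this text; real IR systems use more elaborate weighting schemes. The names `score`, `rank`, and the sample documents are hypothetical.

```python
from collections import Counter


def score(query: str, document: str) -> int:
    """Toy relevance score: total occurrences of the query terms in the document."""
    query_terms = set(query.lower().split())
    term_counts = Counter(document.lower().split())
    return sum(term_counts[term] for term in query_terms)


def rank(query: str, documents: dict[str, str]) -> list[tuple[str, int]]:
    """Return (doc_id, score) pairs for matching documents, best score first."""
    scored = [(doc_id, score(query, text)) for doc_id, text in documents.items()]
    # Documents with a score of zero are treated as not relevant and dropped.
    relevant = [(doc_id, s) for doc_id, s in scored if s > 0]
    return sorted(relevant, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-document collection.
docs = {
    "d1": "information retrieval systems rank documents",
    "d2": "cooking recipes for pasta",
    "d3": "retrieval of multilingual information and information needs",
}

print(rank("information retrieval", docs))
```

Here the document that mentions the query terms most often is listed first, and the document with no matching terms does not appear in the result at all.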

In the last decade, interest in developing systems for multilingual, or cross-lingual, information retrieval (CLIR) has grown dramatically (Grefenstette, 1998). A CLIR system is an information retrieval system capable of operating on a multilingual collection of documents.

The search engines available on the Web or in large corporations are typically based on a single document base: a local, centrally indexed copy of the collections they can access. This approach is no longer valid when not all documents can be copied and indexed centrally. Such is the case with large corporations, which usually have large, widely distributed collections, and with the Internet, where most of the information is generated dynamically and is therefore not accessible to traditional search engines. This is the basic motivation for distributed information retrieval systems.