Cross-Lingual Information Retrieval
10 papers with code • 0 benchmarks • 1 dataset
Cross-Lingual Information Retrieval (CLIR) is a retrieval task in which search queries and candidate documents are written in different languages. CLIR is useful whenever the relevant material exists in a language other than the one the searcher writes queries in. For example, a reporter may want to search foreign-language news to obtain different perspectives for her story; an inventor may explore patents filed in another country to understand prior art.
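As a rough illustration of the task definition above, the sketch below ranks English documents against a Spanish query by first translating the query terms and then counting term overlap. Everything in it is an assumption made for illustration: the `ES_TO_EN` dictionary, the `DOCS` collection, and the function names are hypothetical, and dictionary-based query translation with overlap scoring is only one simple way to approach CLIR, not the method of any paper listed here.

```python
from collections import Counter

# Hypothetical toy Spanish-to-English dictionary (an assumption for this
# sketch); a real system would rely on machine translation or
# cross-lingual representations instead.
ES_TO_EN = {
    "patente": "patent",
    "motor": "engine",
    "noticias": "news",
    "elecciones": "election",
}

# Hypothetical English document collection.
DOCS = [
    "new engine patent filed for electric cars",
    "election news from the capital",
    "recipe for tomato soup",
]


def translate_query(query):
    """Map each query term to English, dropping terms not in the dictionary."""
    return [ES_TO_EN[t] for t in query.lower().split() if t in ES_TO_EN]


def score(query_terms, document):
    """Count how often the translated query terms occur in the document."""
    counts = Counter(document.lower().split())
    return sum(counts[t] for t in query_terms)


def search(query, documents):
    """Return documents with a non-zero score, best match first."""
    terms = translate_query(query)
    scored = [(score(terms, d), d) for d in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for s, d in ranked if s > 0]


if __name__ == "__main__":
    for hit in search("patente motor", DOCS):
        print(hit)
```

The query and the documents never share a surface form here; the match only appears after the query is mapped into the document language, which is the gap every CLIR system has to bridge in some way.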
In contrast, we propose an unsupervised and very resource-light approach for measuring semantic similarity between texts in different languages.
We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.
We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language.
We present CLIReval, an easy-to-use toolkit for evaluating machine translation (MT) with the proxy task of cross-lingual information retrieval (CLIR).
We present DR. DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).
We then evaluate the impact of our cognate detection mechanism on neural machine translation (NMT) as a downstream task.