Cross-Lingual Information Retrieval
13 papers with code • 0 benchmarks • 1 dataset
Cross-Lingual Information Retrieval (CLIR) is a retrieval task in which search queries and candidate documents are written in different languages. CLIR is useful in many scenarios: for example, a reporter may search foreign-language news to obtain different perspectives for her story, or an inventor may explore patents filed in another country to understand prior art.
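A common way to realize CLIR is to embed queries and documents into a shared vector space and rank by similarity. The sketch below illustrates the ranking step only; the `embed` function is a toy stand-in (hashed character trigrams), whereas a real system would use a multilingual sentence encoder so that texts in different languages land in the same space.

```python
# Minimal CLIR-style ranking sketch. NOTE: `embed` is a toy placeholder
# (hashed character trigrams), not a real cross-lingual encoder; it is
# only here to make the ranking logic runnable end to end.
import math


def embed(text, dim=256):
    """Toy embedding: bag of hashed character trigrams, L2-normalized."""
    vec = [0.0] * dim
    t = text.lower()
    for i in range(len(t) - 2):
        vec[hash(t[i:i + 3]) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def rank(query, docs):
    """Return docs sorted by cosine similarity to the query (best first)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(d))), d) for d in docs]
    return [d for _, d in sorted(scored, key=lambda x: x[0], reverse=True)]
```

With a genuine multilingual encoder substituted for `embed`, a query such as "patent application" could retrieve relevant documents written in another language, since semantically similar texts map to nearby vectors regardless of language.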
Benchmarks
These leaderboards are used to track progress in Cross-Lingual Information Retrieval
Most implemented papers
A Resource-Light Method for Cross-Lingual Semantic Textual Similarity
In contrast, we propose an unsupervised, resource-light approach for measuring semantic similarity between texts in different languages.
Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only
We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.
Cross-lingual Information Retrieval with BERT
Multiple neural language models have been developed recently, e.g., BERT and XLNet, achieving impressive results in various NLP tasks including sentence classification, question answering, and document ranking.
Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce
We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language.
CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task
We present CLIReval, an easy-to-use toolkit for evaluating machine translation (MT) with the proxy task of cross-lingual information retrieval (CLIR).
Macro-Average: Rare Types Are Important Too
While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy.
Cognition-aware Cognate Detection
We collect gaze behaviour data for a small sample of cognates and show that extracted cognitive features help the task of cognate detection.
Learning Cross-Lingual IR from an English Retriever
We present DR. DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).
Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages
We, then, evaluate the impact of our cognate detection mechanism on neural machine translation (NMT), as a downstream task.
CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval
Given the absence of cross-lingual information retrieval datasets with claim-like queries, we train the retriever with our proposed Cross-lingual Inverse Cloze Task (X-ICT), a self-supervised algorithm that creates training instances by translating the title of a passage.
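The X-ICT idea described above (training pairs built by translating a passage's title into the query language) can be sketched as follows. The `translate` function is a hypothetical placeholder; a real pipeline would invoke an actual machine translation model.

```python
# Hedged sketch of Cross-lingual Inverse Cloze Task (X-ICT) data
# construction: each training instance pairs a passage with its title
# translated into the target (query) language.

def translate(text, target_lang):
    # Placeholder for a real MT system; here we just tag the text so the
    # pipeline is runnable for illustration purposes.
    return f"[{target_lang}] {text}"


def make_xict_pairs(passages, target_lang):
    """Build (translated_title, passage_body) self-supervised pairs."""
    return [(translate(p["title"], target_lang), p["body"]) for p in passages]
```

A retriever can then be trained on these (query, passage) pairs with a standard contrastive objective, treating the other passages in a batch as negatives.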