Cross-Lingual Information Retrieval

10 papers with code • 0 benchmarks • 1 dataset

Cross-Lingual Information Retrieval (CLIR) is a retrieval task in which search queries and candidate documents are written in different languages. It is useful whenever the relevant material is written in a language other than the one the searcher uses for queries. For example, a reporter may search foreign-language news to obtain different perspectives for her story, and an inventor may explore patents filed in another country to understand prior art.
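One common way to approach the task is to translate the query into the document language and then run ordinary monolingual retrieval. The sketch below illustrates that idea only; it is not taken from any of the papers listed here. The toy German-to-English dictionary `TOY_DICT`, the sample `DOCS`, and the helper names `translate_query`, `score`, and `search` are all hypothetical, and the scoring is a simple smoothed TF-IDF sum chosen for brevity.

```python
import math
from collections import Counter

# Hypothetical toy German-to-English dictionary (an assumption for illustration).
TOY_DICT = {
    "patent": ["patent"],
    "erfindung": ["invention"],
    "nachrichten": ["news"],
    "wahl": ["election", "choice"],
}

# Hypothetical English document collection.
DOCS = [
    "the election news covered the vote",
    "a patent protects an invention",
    "weather report for the weekend",
]


def translate_query(query, dictionary):
    """Replace each query term with all of its dictionary translations.

    Terms missing from the dictionary are kept unchanged.
    """
    terms = []
    for token in query.lower().split():
        terms.extend(dictionary.get(token, [token]))
    return terms


def score(query_terms, doc, doc_freq, n_docs):
    """Sum smoothed TF-IDF weights of the query terms found in the document."""
    tf = Counter(doc.lower().split())
    total = 0.0
    for term in query_terms:
        if tf[term]:
            idf = math.log((n_docs + 1) / (doc_freq[term] + 1)) + 1.0
            total += tf[term] * idf
    return total


def search(query, docs, dictionary=TOY_DICT):
    """Return document indices ranked by score for a source-language query."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc.lower().split()))
    terms = translate_query(query, dictionary)
    return sorted(
        range(len(docs)),
        key=lambda i: score(terms, docs[i], doc_freq, len(docs)),
        reverse=True,
    )


if __name__ == "__main__":
    for q in ("patent erfindung", "wahl nachrichten"):
        best = search(q, DOCS)[0]
        print(q, "->", DOCS[best])
```

Dictionary lookup is the weakest link in this sketch: ambiguous terms such as "wahl" expand to every listed translation, which is one reason query translation quality matters so much for CLIR.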

Most implemented papers

A Resource-Light Method for Cross-Lingual Semantic Textual Similarity

gg42554/cl-sts 19 Jan 2018

We propose an unsupervised and very resource-light approach for measuring semantic similarity between texts in different languages.

Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only

rlitschk/UnsupCLIR 2 May 2018

We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.

Cross-lingual Information Retrieval with BERT

MulukenW/test LREC 2020

Multiple neural language models, e.g., BERT and XLNet, have been developed recently and have achieved impressive results in various NLP tasks, including sentence classification, question answering, and document ranking.

Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce

LiuChang97/CLMN 17 May 2020

We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language.

CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task

ssun32/CLIReval ACL 2020

We present CLIReval, an easy-to-use toolkit for evaluating machine translation (MT) with the proxy task of cross-lingual information retrieval (CLIR).

Domain Transfer based Data Augmentation for Neural Query Translation

starryskyyl/dtda COLING 2020

Query translation (QT) serves as a critical factor in successful cross-lingual information retrieval (CLIR).

Macro-Average: Rare Types Are Important Too

thammegowda/007-mt-eval-macro NAACL 2021

While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy.

Cognition-aware Cognate Detection

prashantksharma/CaCD EACL 2021

We collect gaze behaviour data for a small sample of cognates and show that extracted cognitive features help the task of cognate detection.

Learning Cross-Lingual IR from an English Retriever

primeqa/primeqa NAACL 2022

We present DR.DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).

Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages

dipteshkanojia/challengeCognateFF COLING 2020

We then evaluate the impact of our cognate detection mechanism on neural machine translation (NMT) as a downstream task.