Cross-Lingual Question Answering
12 papers with code • 3 benchmarks • 5 datasets
Most implemented papers
Scaling Instruction-Finetuned Language Models
We find that instruction finetuning, in particular scaling the number of finetuning tasks, scaling the model size, and finetuning on chain-of-thought data, dramatically improves performance across a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation).
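To make the three prompting setups concrete, here is a minimal sketch of how zero-shot, few-shot, and chain-of-thought prompts differ. The templates are illustrative examples, not the actual Flan templates:

```python
# Illustrative prompt builders for the three evaluation setups.
# These templates are hypothetical; the paper's own templates differ.

def zero_shot(question: str) -> str:
    # No exemplars: the model must answer from instructions alone.
    return f"Answer the following question.\nQ: {question}\nA:"

def few_shot(question: str, exemplars: list[tuple[str, str]]) -> str:
    # A handful of (question, answer) demonstrations precede the query.
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)
    return f"{shots}\nQ: {question}\nA:"

def chain_of_thought(question: str) -> str:
    # The trailing cue elicits intermediate reasoning steps (CoT).
    return f"Q: {question}\nA: Let's think step by step."
```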
On the Cross-lingual Transferability of Monolingual Representations
This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages, which give rise to deep multilingual abstractions.
PaLM: Scaling Language Modeling with Pathways
To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).
ByT5: Towards a token-free future with pre-trained byte-to-byte models
Most widely-used pre-trained language models operate on sequences of tokens corresponding to word or subword units.
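ByT5 instead operates directly on UTF-8 bytes, so no learned vocabulary is needed. A minimal sketch of byte-level encoding, assuming ByT5's convention of reserving a small id offset for special tokens:

```python
# Token-free encoding: map each UTF-8 byte to an integer id.
# The offset of 3 follows ByT5's convention (0=pad, 1=eos, 2=unk);
# treat it as an assumption here rather than a guaranteed detail.
SPECIAL_TOKEN_OFFSET = 3

def encode(text: str) -> list[int]:
    return [b + SPECIAL_TOKEN_OFFSET for b in text.encode("utf-8")]

def decode(ids: list[int]) -> str:
    return bytes(i - SPECIAL_TOKEN_OFFSET for i in ids).decode("utf-8")

# Works for any language without a tokenizer, including accented text.
assert decode(encode("¿Dónde está?")) == "¿Dónde está?"
```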
Rethinking embedding coupling in pre-trained language models
We re-evaluate the standard practice of sharing weights between input and output embeddings in state-of-the-art pre-trained language models.
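The practice in question is weight tying. A minimal PyTorch sketch contrasting coupled and decoupled embeddings (the model here is a toy, not the paper's architecture):

```python
import torch.nn as nn

class TinyLM(nn.Module):
    """Toy LM contrasting coupled vs. decoupled input/output embeddings."""

    def __init__(self, vocab_size: int, d_model: int, tie_weights: bool = True):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)
        if tie_weights:
            # Coupled: the output projection reuses the input embedding matrix.
            self.lm_head.weight = self.embed.weight
        # Decoupled (tie_weights=False): the two matrices are trained
        # independently, the alternative the paper re-evaluates.

    def forward(self, ids):
        # Encoder layers omitted; only the embedding coupling is shown.
        return self.lm_head(self.embed(ids))
```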
mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models
We train a multilingual language model in 24 languages with entity representations and show that the model consistently outperforms word-based pretrained models on various cross-lingual transfer tasks.
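The core idea, in the LUKE family of models, is to give entities their own embedding table and feed them alongside word tokens. A rough sketch under that assumption (the encoder and entity-aware attention are omitted):

```python
import torch
import torch.nn as nn

class WordsPlusEntities(nn.Module):
    """Hypothetical sketch: words and entities form one input sequence,
    each drawn from its own embedding table."""

    def __init__(self, vocab_size: int, n_entities: int, d_model: int):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, d_model)
        self.entity_embed = nn.Embedding(n_entities, d_model)

    def forward(self, word_ids, entity_ids):
        # (batch, seq_len, d_model) for each stream; the transformer then
        # attends over words and entities jointly. Encoder omitted here.
        words = self.word_embed(word_ids)
        entities = self.entity_embed(entity_ids)
        return torch.cat([words, entities], dim=1)
```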
Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering
Deep learning architectures, coupled with the availability of large-scale datasets, have enabled rapid progress on the question answering task.
Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension
We convert KB subgraphs into passages to narrow the gap between KB schemas and questions, which enables our model to benefit from recent advances in multilingual pre-trained language models (MPLMs) and cross-lingual machine reading comprehension (xMRC).
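A simple way to picture the subgraph-to-passage step is to linearize each triple into a sentence. The verbalization below is a hypothetical sketch; the paper's exact scheme may differ:

```python
# Hypothetical linearization of a KB subgraph into a textual passage,
# so KBQA can be handled as multilingual reading comprehension.
def subgraph_to_passage(triples: list[tuple[str, str, str]]) -> str:
    return " ".join(f"{s} {r.replace('_', ' ')} {o}." for s, r, o in triples)

triples = [("Marie Curie", "born_in", "Warsaw"),
           ("Marie Curie", "field", "physics")]
print(subgraph_to_passage(triples))
# -> "Marie Curie born in Warsaw. Marie Curie field physics."
```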
PAXQA: Generating Cross-lingual Question Answering Examples at Training Scale
This work proposes a synthetic data generation method for cross-lingual QA which leverages indirect supervision from existing parallel corpora.
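One common way parallel corpora provide such indirect supervision is annotation projection: mapping an answer span from the source side to the target side through word alignments. A minimal sketch, assuming alignments are given as index pairs (the function name and interface are illustrative, not PAXQA's API):

```python
# Hypothetical span projection through word alignments from a parallel corpus.
def project_span(src_span: tuple[int, int],
                 alignments: list[tuple[int, int]]) -> tuple[int, int] | None:
    """src_span is (start, end) over source tokens, inclusive;
    alignments are (src_idx, tgt_idx) pairs."""
    tgt = [t for s, t in alignments if src_span[0] <= s <= src_span[1]]
    if not tgt:
        return None  # unaligned span: drop this candidate example
    return min(tgt), max(tgt)

# A source span over tokens 3-4 projected through toy alignments:
print(project_span((3, 4), [(0, 1), (3, 5), (4, 6)]))  # -> (5, 6)
```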
Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation
Our approach seeks to enhance cross-lingual QA transfer using a high-performing multilingual model trained on a large-scale dataset, complemented by a few thousand aligned QA examples across languages.
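A generic self-distillation objective blends supervised cross-entropy with a KL term toward the teacher's softened predictions. The sketch below shows that standard formulation; the temperature and mixing weight are placeholders, not the paper's settings:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Cross-entropy on gold labels blended with KL divergence to the
    teacher's temperature-softened distribution (standard KD recipe)."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale gradients, as in standard distillation
    return alpha * ce + (1 - alpha) * kl
```

In self-distillation the "teacher" is an earlier or averaged snapshot of the same model rather than a separate larger one, which is what lets a few thousand aligned QA examples go further.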