Search Results for author: Asa Cooper Stickland

Found 11 papers, 7 papers with code

Regularising Fisher Information Improves Cross-lingual Generalisation

no code implementations • EMNLP (MRL) 2021 • Asa Cooper Stickland, Iain Murray

Many recent works use ‘consistency regularisation’ to improve the generalisation of fine-tuned pre-trained models, both multilingual and English-only.

Tasks: Memorization
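The excerpt does not give the paper's exact loss; as a generic illustration, consistency regularisation typically penalises divergence between a model's predictive distributions on an input and a perturbed copy of it. A minimal stdlib-only sketch of such a term (the symmetric-KL form here is one common choice, not necessarily the paper's):

```python
import math

def softmax(logits):
    """Convert raw scores to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(logits_clean, logits_perturbed):
    """Symmetric KL between predictions on the clean and perturbed
    versions of the same input; added to the task loss, it pushes the
    model to predict consistently under perturbation."""
    p = softmax(logits_clean)
    q = softmax(logits_perturbed)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

Identical predictions give a loss of zero, so the term only fires when the perturbation changes the model's output distribution.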

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

1 code implementation • 20 Nov 2023 • David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, Samuel R. Bowman

We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry.

Tasks: Multiple-choice

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

1 code implementation • 21 Sep 2023 • Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans

If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A".

Tasks: Data Augmentation, Sentence
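The finding concerns the direction of facts seen during training. A hypothetical sketch of how such an evaluation split could be built (the name, fact, and templates below are illustrative, not the paper's actual data):

```python
def make_reversal_split(pairs):
    """Build training sentences in the forward 'A is B' direction and
    test queries in the reversed 'B is A' direction.

    The reversal-curse finding is that a model fine-tuned only on the
    forward sentences tends not to answer the reversed queries."""
    train = [f"{a} is {b}." for a, b in pairs]
    # Reversed direction: prompt with B, expect A as the completion.
    test = [(f"{b} is", a) for a, b in pairs]
    return train, test

# Fictitious example pair (hypothetical, for illustration only).
pairs = [("Alice Zemora", "the author of 'The Glass Meridian'")]
train, test = make_reversal_split(pairs)
```

Evaluating the fine-tuned model on `test` then probes whether the reversed fact was learned from the forward sentence alone.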

Robustification of Multilingual Language Models to Real-world Noise in Crosslingual Zero-shot Settings with Robust Contrastive Pretraining

1 code implementation • 10 Oct 2022 • Asa Cooper Stickland, Sailik Sengupta, Jason Krone, Saab Mansour, He He

To benchmark the performance of pretrained multilingual language models, we construct noisy datasets covering five languages and four NLP tasks and observe a clear gap in the performance between clean and noisy data in the zero-shot cross-lingual setting.

Tasks: Data Augmentation, Pretrained Multilingual Language Models +1
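The excerpt does not specify how the noisy datasets were built; one common way to simulate real-world typo noise for such benchmarks is character-level perturbation. A hypothetical sketch (the swap-adjacent-characters scheme and rate are assumptions for illustration, not the paper's procedure):

```python
import random

def add_typo_noise(text, rate=0.1, seed=0):
    """Simulate real-world typos by randomly swapping adjacent
    alphabetic characters at the given per-position rate.
    Deterministic for a fixed seed, so noisy benchmarks are reproducible."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair so it isn't re-swapped
        else:
            i += 1
    return "".join(chars)
```

Comparing model accuracy on `text` versus `add_typo_noise(text)` gives the clean-vs-noisy gap the abstract describes.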

When does Parameter-Efficient Transfer Learning Work for Machine Translation?

1 code implementation • 23 May 2022 • Ahmet Üstün, Asa Cooper Stickland

We find that using PEFTs with a larger pre-trained model outperforms full fine-tuning with a smaller model, and for smaller training data sizes, PEFTs outperform full fine-tuning for the same pre-trained model.

Tasks: Machine Translation, Transfer Learning +1
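The excerpt does not say which PEFT methods were compared; bottleneck adapters are one standard family. A dependency-free sketch of an adapter applied to a hidden-state vector (dimensions and weights below are illustrative):

```python
def matvec(W, x):
    """Matrix-vector product with W given as a list of rows."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, x) for x in v]

def adapter(x, W_down, W_up):
    """Bottleneck adapter: project the hidden state down to a small
    dimension, apply a nonlinearity, project back up, and add a residual
    connection. Only W_down and W_up are trained; the surrounding
    pre-trained model stays frozen."""
    h = relu(matvec(W_down, x))
    out = matvec(W_up, h)
    return [xi + oi for xi, oi in zip(x, out)]
```

Initialising `W_up` near zero makes the adapter start as an identity map, so fine-tuning begins from the frozen model's behaviour, which is the usual design choice for this family of methods.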

Deep Transformers with Latent Depth

1 code implementation • NeurIPS 2020 • Xian Li, Asa Cooper Stickland, Yuqing Tang, Xiang Kong

As an extension of this framework, we propose a novel method to train one shared Transformer network for multilingual machine translation with different layer selection posteriors for each language pair.

Tasks: Language Modelling, Machine Translation +2
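At inference time, a per-language-pair layer selection posterior can be read as a gate over a shared layer stack. A toy sketch of that idea (hard thresholding of gate probabilities is an assumption; training would sample or relax the gates rather than threshold them):

```python
def run_shared_stack(x, layers, gates, threshold=0.5):
    """Apply a shared stack of layer functions, keeping only the layers
    whose selection probability for this language pair exceeds the
    threshold. Different language pairs supply different `gates` over
    the same shared `layers`."""
    for layer, g in zip(layers, gates):
        if g >= threshold:
            x = layer(x)
    return x

# Toy 'layers' acting on a scalar hidden state.
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v + 3]
```

One language pair might keep layers 1 and 3 while another keeps all three, giving each pair its own effective depth within one shared network.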

Diverse Ensembles Improve Calibration

no code implementations • 8 Jul 2020 • Asa Cooper Stickland, Iain Murray

Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched.

Tasks: Data Augmentation
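Miscalibration of the kind the abstract describes is commonly quantified with expected calibration error (ECE): bin predictions by confidence and compare each bin's average confidence to its accuracy. A minimal sketch using the standard equal-width binning (the metric choice is a common convention, not confirmed by this excerpt):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average over confidence bins of
    |accuracy(bin) - mean confidence(bin)|. Zero means the model's
    stated confidence matches its empirical accuracy."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        acc = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(acc - avg_conf)
    return ece
```

An overconfident model (high confidence, lower accuracy) scores a large ECE, which is the failure mode the paper attributes to mismatched train and test distributions.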
