Prefix Embeddings for In-context Machine Translation

AMTA 2022 · Suzanna Sia, Kevin Duh

Very large language models have been shown to translate with few-shot in-context examples. However, they have not achieved state-of-the-art results when translating out of English. In this work, we investigate an extremely lightweight fixed-parameter method for conditioning a large language model to better translate into the target language. Our method introduces additional embeddings, known as prefix embeddings, which do not interfere with the existing weights of the model. Using unsupervised and weakly semi-supervised methods that train only 0.0001% of the model parameters, this simple method improves ~0.2-1.3 BLEU points across 3 domains and 3 languages. We analyze the training dynamics of the resulting embeddings and where they lie in the embedding space, and show that the trained embeddings can be used both for in-context translation and for diverse generation of the target sentence.
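The core idea described in the abstract is to learn a small set of prefix vectors that are prepended to the frozen model's input embeddings, so that only these vectors (a tiny fraction of the total parameters) receive gradient updates. The following is a minimal PyTorch sketch of that general prefix-embedding setup, not the authors' implementation; the class name, prefix length, and the HuggingFace-style usage comments are illustrative assumptions.

```python
import torch
import torch.nn as nn


class PrefixEmbeddings(nn.Module):
    """Trainable prefix vectors prepended to a frozen LM's input embeddings.

    Hypothetical sketch: `num_prefix_tokens` and `d_model` are chosen by the
    user; the underlying language model itself is kept entirely frozen.
    """

    def __init__(self, num_prefix_tokens: int, d_model: int):
        super().__init__()
        # The only trainable parameters: one vector per prefix position.
        self.prefix = nn.Parameter(torch.randn(num_prefix_tokens, d_model) * 0.02)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model)
        batch_size = token_embeddings.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch_size, -1, -1)
        # Prepend the prefix along the sequence dimension.
        return torch.cat([prefix, token_embeddings], dim=1)


# Usage sketch, assuming a HuggingFace-style causal LM `lm` that accepts
# `inputs_embeds` (e.g. GPT-2); these calls are illustrative assumptions:
#
# for p in lm.parameters():
#     p.requires_grad_(False)                       # freeze all model weights
# prefix_module = PrefixEmbeddings(10, lm.config.hidden_size)
# inputs_embeds = prefix_module(lm.get_input_embeddings()(input_ids))
# outputs = lm(inputs_embeds=inputs_embeds)         # only the prefix is trained
```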
