Search Results for author: Jannis Vamvas

Found 14 papers, 13 papers with code

Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias

1 code implementation • ACL ARR May 2021 • Jannis Vamvas, Rico Sennrich

Lexical disambiguation is a major challenge for machine translation systems, especially if some senses of a word are trained less often than others.

Knowledge Distillation, Machine Translation, +2

A Multilingual Simplified Language News Corpus

no code implementations • READI (LREC) 2022 • Renate Hauser, Jannis Vamvas, Sarah Ebling, Martin Volk

Simplified language news articles are being offered by specialized web portals in several countries.

Text Simplification

Linear-time Minimum Bayes Risk Decoding with Reference Aggregation

2 code implementations • 6 Feb 2024 • Jannis Vamvas, Rico Sennrich

Minimum Bayes Risk (MBR) decoding is a text generation technique that has been shown to improve the quality of machine translations, but is expensive, even if a sampling-based approximation is used.

Text Generation

Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect

1 code implementation • 25 Jan 2024 • Jannis Vamvas, Noëmi Aepli, Rico Sennrich

Creating neural text encoders for written Swiss German is challenging due to a dearth of training data combined with dialectal variation.

Machine Translation Models are Zero-Shot Detectors of Translation Direction

1 code implementation • 12 Jan 2024 • Michelle Wastl, Jannis Vamvas, Rico Sennrich

Detecting the translation direction of parallel text has applications for machine translation training and evaluation, but also has forensic applications such as resolving plagiarism or forgery allegations.

Machine Translation, NMT, +1

Trained MT Metrics Learn to Cope with Machine-translated References

1 code implementation • 1 Dec 2023 • Jannis Vamvas, Tobias Domhan, Sony Trenous, Rico Sennrich, Eva Hasler

Neural metrics trained on human evaluations of MT tend to correlate well with human judgments, but their behavior is not fully understood.

Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models

1 code implementation • 13 Nov 2023 • Alireza Mohammadshahi, Jannis Vamvas, Rico Sennrich

Massively multilingual machine translation models allow for the translation of a large number of languages with a single model, but have limited performance on low- and very-low-resource translation directions.

Hallucination, Machine Translation, +1

Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding

1 code implementation • 13 Sep 2023 • Rico Sennrich, Jannis Vamvas, Alireza Mohammadshahi

Experiments on the massively multilingual models M2M-100 (418M) and SMaLL-100 show that source-contrastive and language-contrastive decoding suppress hallucinations and off-target translations. On average across 57 tested translation directions, they reduce the number of translations with segment-level chrF2 below 10 by 67-83% and the number of translations with oscillatory hallucinations by 75-92%.

Machine Translation, Translation

Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents

1 code implementation • 22 May 2023 • Jannis Vamvas, Rico Sennrich

Automatically highlighting words that cause semantic differences between two documents could be useful for a wide range of applications.

Contrastive Learning, Language Modelling, +3

SwissBERT: The Multilingual Language Model for Switzerland

1 code implementation • 23 Mar 2023 • Jannis Vamvas, Johannes Graën, Rico Sennrich

We present SwissBERT, a masked language model created specifically for processing Switzerland-related text.

Language Modelling, Natural Language Understanding

NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures

1 code implementation • 28 Apr 2022 • Jannis Vamvas, Rico Sennrich

Being able to rank the similarity of short text segments is an interesting bonus feature of neural machine translation.

Data-to-Text Generation, Machine Translation, +6

As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning

1 code implementation • ACL 2022 • Jannis Vamvas, Rico Sennrich

Omission and addition of content is a typical issue in neural machine translation.

Machine Translation, Translation

On the Limits of Minimal Pairs in Contrastive Evaluation

1 code implementation • EMNLP (BlackboxNLP) 2021 • Jannis Vamvas, Rico Sennrich

Minimal sentence pairs are frequently used to analyze the behavior of language models.

Sentence

X-Stance: A Multilingual Multi-Target Dataset for Stance Detection

1 code implementation • 18 Mar 2020 • Jannis Vamvas, Rico Sennrich

Unlike stance detection models that are trained for specific target issues, we use the dataset to train a single model on all the issues.

Stance Detection
