1 code implementation • 9 Sep 2023 • Elena Voita, Javier Ferrando, Christoforos Nalmpantis
Specifically, we focus on the OPT family of models ranging from 125M to 66B parameters and rely only on whether an FFN neuron is activated or not.
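A minimal illustrative sketch (not the paper's released code) of recording which FFN neurons fire: since OPT's feed-forward blocks use ReLU, a neuron is activated exactly when its fc1 pre-activation is positive, which a forward hook can capture. Assumes the Hugging Face facebook/opt-125m checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m").eval()

activated = {}  # layer index -> boolean tensor over (token, neuron) pairs

def make_hook(layer_idx):
    def hook(module, inputs, output):
        # fc1 output is the FFN pre-activation; with ReLU, a neuron
        # "fires" iff this value is positive
        activated[layer_idx] = output > 0
    return hook

for i, layer in enumerate(model.model.decoder.layers):
    layer.fc1.register_forward_hook(make_hook(i))

with torch.no_grad():
    model(**tok("The quick brown fox", return_tensors="pt"))

frac = activated[0].float().mean().item()
print(f"layer 0: {frac:.1%} of (token, neuron) pairs fired")
```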
1 code implementation • 19 May 2023 • David Dale, Elena Voita, Janice Lam, Prangthip Hansanti, Christophe Ropers, Elahe Kalbassi, Cynthia Gao, Loïc Barrault, Marta R. Costa-jussà
Hallucinations in machine translation are translations that contain information completely unrelated to the input.
no code implementations • 16 Dec 2022 • David Dale, Elena Voita, Loïc Barrault, Marta R. Costa-jussà
We propose to use a method that evaluates the percentage of the source contribution to a generated translation.
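As a crude, simplified proxy for this idea (not the paper's exact attribution method), one can compare how much attention mass a generated token sends to the source versus its own target prefix; a near-zero source share flags a candidate hallucination.

```python
import numpy as np

def source_contribution(cross_attn, self_attn):
    """cross_attn: attention mass from a target token to source tokens;
    self_attn: attention mass to previously generated target tokens."""
    src, tgt = cross_attn.sum(), self_attn.sum()
    return src / (src + tgt)

# A token attending almost exclusively to its own prefix (low source
# contribution) is suspicious:
print(source_contribution(np.array([0.02, 0.03]), np.array([0.50, 0.45])))
```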
2 code implementations • 10 Aug 2022 • Nuno M. Guerreiro, Elena Voita, André F. T. Martins
Although the problem of hallucinations in neural machine translation (NMT) has received some attention, research on this highly pathological phenomenon lacks solid ground.
no code implementations • EMNLP 2021 • Elena Voita, Rico Sennrich, Ivan Titov
Unlike traditional statistical MT, which decomposes the translation task into distinct, separately learned components, neural machine translation uses a single neural network to model the entire translation process.
no code implementations • 1 Jan 2021 • Max Ryabinin, Artem Babenko, Elena Voita
In this work, we make the first step towards unsupervised discovery of interpretable directions in language latent spaces.
1 code implementation • ACL 2021 • Elena Voita, Rico Sennrich, Ivan Titov
We find that models trained with more data tend to rely on source information more and to have sharper token contributions; the training process is non-monotonic, with several stages of distinct nature.
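One common way to quantify how "sharp" a token-contribution distribution is (used here purely as an illustration) is its entropy over normalized contributions: lower entropy means the prediction relies on fewer tokens.

```python
import numpy as np

def contribution_entropy(contrib):
    p = np.asarray(contrib, dtype=float)
    p = p / p.sum()                      # normalize to a distribution
    return float(-(p * np.log(p + 1e-12)).sum())

print(contribution_entropy([0.90, 0.05, 0.05]))  # sharp -> low entropy
print(contribution_entropy([0.34, 0.33, 0.33]))  # flat  -> high entropy
```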
1 code implementation • EMNLP 2020 • Max Ryabinin, Sergei Popov, Liudmila Prokhorenkova, Elena Voita
We adopt a recent method that learns a representation of data in the form of a differentiable weighted graph and use it to modify the GloVe training algorithm.
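For reference, a minimal sketch of the standard GloVe objective that is being modified; the graph-based representation itself is not reproduced here.

```python
import numpy as np

def glove_loss(W, W_ctx, b, b_ctx, X, x_max=100.0, alpha=0.75):
    """Weighted least squares over nonzero co-occurrence counts X."""
    total = 0.0
    for i, j in zip(*np.nonzero(X)):
        f = min((X[i, j] / x_max) ** alpha, 1.0)  # GloVe weighting function
        err = W[i] @ W_ctx[j] + b[i] + b_ctx[j] - np.log(X[i, j])
        total += f * err ** 2
    return total
```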
2 code implementations • EMNLP 2020 • Elena Voita, Ivan Titov
Instead, we propose an alternative to the standard probes, information-theoretic probing with minimum description length (MDL).
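A minimal sketch of the online (prequential) code length at the heart of MDL probing: train the probe on growing prefixes of the data and sum the cost, in bits, of encoding each next block with the probe trained so far. Here `fit` and `neg_log_likelihood` are placeholders for any probe family, and the split points are illustrative.

```python
import numpy as np

def online_codelength(X, y, fit, neg_log_likelihood, n_classes,
                      fractions=(0.001, 0.002, 0.004, 0.008, 0.016, 0.032,
                                 0.0625, 0.125, 0.25, 0.5, 1.0)):
    n = len(y)
    cuts = sorted({max(1, int(f * n)) for f in fractions})
    total = cuts[0] * np.log2(n_classes)   # first block: uniform code
    for a, b in zip(cuts[:-1], cuts[1:]):
        probe = fit(X[:a], y[:a])          # train on the prefix seen so far
        total += neg_log_likelihood(probe, X[a:b], y[a:b])  # bits for next block
    return total
```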
1 code implementation • NeurIPS 2019 • Dmitrii Emelianenko, Elena Voita, Pavel Serdyukov
The dominant approach to sequence generation is to produce a sequence in some predefined order, e.g., left to right.
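A toy sketch of what dropping the predefined order looks like: at each step the model jointly picks an insertion position and a token, so the sequence is built in an arbitrary order. Here `score` is a hypothetical model interface, not the paper's.

```python
def generate_by_insertion(score, vocab, max_len=10):
    seq = []
    while len(seq) < max_len:
        # jointly choose the highest-scoring (position, token) pair
        pos, tok = max(((p, t) for p in range(len(seq) + 1) for t in vocab),
                       key=lambda pt: score(seq, pt[0], pt[1]))
        if tok == "<eos>":
            break
        seq.insert(pos, tok)
    return seq
```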
7 code implementations • ACL 2020 • Ivan Provilkov, Dmitrii Emelianenko, Elena Voita
Subword segmentation is widely used to address the open vocabulary problem in machine translation.
Ranked #1 on Machine Translation on IWSLT2017 English-Arabic
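A minimal sketch of the BPE-dropout idea: apply learned merges greedily as in standard BPE, but randomly skip each candidate merge with probability p, so the same word gets stochastic segmentations during training. The example merge table is made up.

```python
import random

def bpe_dropout_segment(word, merges, p=0.1):
    """merges: list of (left, right) pairs in priority order."""
    rank = {m: i for i, m in enumerate(merges)}
    symbols = list(word)
    while True:
        candidates = [(rank[(a, b)], i) for i, (a, b)
                      in enumerate(zip(symbols, symbols[1:])) if (a, b) in rank]
        if not candidates:
            break
        survivors = [c for c in candidates if random.random() >= p]  # dropout
        if not survivors:
            break  # simplification: stop if every merge was dropped this step
        _, i = min(survivors)                   # best-ranked surviving merge
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols

print(bpe_dropout_segment("unrelated", [("u", "n"), ("r", "e"), ("re", "l")]))
```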
no code implementations • IJCNLP 2019 • Elena Voita, Rico Sennrich, Ivan Titov
In this work, we use canonical correlation analysis and mutual information estimators to study how information flows across Transformer layers and how this process depends on the choice of learning objective.
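A minimal sketch of linear CCA between the token representations of two layers (the projection-weighted variant and the MI estimators are not shown):

```python
import numpy as np

def mean_cca(X, Y):
    """X, Y: (n_tokens, dim) activations from two layers."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    ux = np.linalg.svd(X, full_matrices=False)[0]  # whitened basis for X
    uy = np.linalg.svd(Y, full_matrices=False)[0]  # whitened basis for Y
    # canonical correlations = singular values of the whitened cross-product
    return np.linalg.svd(ux.T @ uy, compute_uv=False).mean()

x = np.random.randn(1000, 64)
print(mean_cca(x, x @ np.random.randn(64, 64)))  # ~1.0: same subspace
```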
1 code implementation • IJCNLP 2019 • Elena Voita, Rico Sennrich, Ivan Titov
For training, the DocRepair model requires only monolingual document-level data in the target language.
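A sketch of how such training pairs can be built from monolingual documents: round-trip each sentence through an MT system independently, so cross-sentence inconsistencies appear, and learn to map the corrupted group back to the consistent original. Here `translate` is a hypothetical sentence-level MT interface.

```python
def make_docrepair_pair(doc_sentences, translate):
    corrupted = [translate(translate(s, to="src"), to="tgt")  # round-trip
                 for s in doc_sentences]                      # each sentence
    source = " <sep> ".join(corrupted)       # inconsistent input group
    target = " <sep> ".join(doc_sentences)   # consistent original document
    return source, target
```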
1 code implementation • ACL 2019 • Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation.
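For reference, a standard scaled dot-product multi-head self-attention sketch (shapes follow the usual Transformer convention):

```python
import numpy as np

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (seq, d_model); Wq, Wk, Wv, Wo: (d_model, d_model)."""
    seq, d_model = X.shape
    d_head = d_model // n_heads

    def split(A):  # (seq, d_model) -> (n_heads, seq, d_head)
        return A.reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)             # row-wise softmax
    heads = weights @ V                                   # (heads, seq, d_head)
    return heads.transpose(1, 0, 2).reshape(seq, d_model) @ Wo
```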
1 code implementation • ACL 2019 • Elena Voita, Rico Sennrich, Ivan Titov
Though machine translation errors caused by the lack of context beyond one sentence have long been acknowledged, the development of context-aware NMT systems is hampered by several problems.
1 code implementation • WS 2018 • Mathias Müller, Annette Rios, Elena Voita, Rico Sennrich
We show that, while gains in BLEU are moderate for those systems, they outperform baselines by a large margin in terms of accuracy on our contrastive test set.
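A minimal sketch of how accuracy on a contrastive test set is typically computed: the system passes an example iff it scores the correct translation above every contrastive variant. Here `score` is a hypothetical log-probability interface.

```python
def contrastive_accuracy(examples, score):
    """examples: iterable of (source, correct, contrastive_variants)."""
    hits = 0
    for src, good, bads in examples:
        if all(score(src, good) > score(src, bad) for bad in bads):
            hits += 1
    return hits / len(examples)
```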
no code implementations • ACL 2018 • Elena Voita, Pavel Serdyukov, Rico Sennrich, Ivan Titov
Standard machine translation systems process sentences in isolation and hence ignore extra-sentential information, even though extended context can both prevent mistakes in ambiguous cases and improve translation coherence.