no code implementations • WMT (EMNLP) 2021 • Carlos Escolano, Ioannis Tsiamas, Christine Basta, Javier Ferrando, Marta R. Costa-jussà, José A. R. Fonollosa
We fine-tune mBART50 using the filtered data, and additionally, we train a Transformer model on the same data from scratch.
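A minimal sketch of what such fine-tuning can look like with the Hugging Face Transformers API; the checkpoint, language pair, and toy corpus below are illustrative assumptions, not the paper's exact setup:

```python
# Sketch: fine-tuning mBART50 on a (placeholder) filtered parallel corpus.
from datasets import Dataset
from transformers import (DataCollatorForSeq2Seq, MBart50TokenizerFast,
                          MBartForConditionalGeneration,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

model_name = "facebook/mbart-large-50-many-to-many-mmt"
tokenizer = MBart50TokenizerFast.from_pretrained(
    model_name, src_lang="en_XX", tgt_lang="fr_XX")
model = MBartForConditionalGeneration.from_pretrained(model_name)

# Toy stand-in for the filtered parallel data used in the paper.
train_dataset = Dataset.from_dict(
    {"src": ["Hello, how are you?"], "tgt": ["Bonjour, comment ça va ?"]})

def preprocess(batch):
    # Tokenize source and target; target token ids become the labels.
    return tokenizer(batch["src"], text_target=batch["tgt"],
                     truncation=True, max_length=128)

args = Seq2SeqTrainingArguments(output_dir="mbart50-ft",
                                per_device_train_batch_size=8,
                                num_train_epochs=3)
trainer = Seq2SeqTrainer(
    model=model, args=args,
    train_dataset=train_dataset.map(preprocess, batched=True),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model))
trainer.train()
```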
1 code implementation • 10 Apr 2024 • Igor Tufanov, Karen Hambardzumyan, Javier Ferrando, Elena Voita
We present the LM Transparency Tool (LM-TT), an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models.
1 code implementation • 27 Feb 2024 • Javier Ferrando, Elena Voita
These routes can be represented as graphs where nodes correspond to token representations and edges to operations inside the network.
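An illustrative sketch of this graph view, with nodes for per-layer token representations and weighted edges for residual and attention operations; the edge weights here are made up, whereas in practice they would come from an attribution method:

```python
# Toy information-flow graph over a 3-token, 2-layer causal Transformer.
import networkx as nx

tokens = ["The", "cat", "sat"]
n_layers = 2
G = nx.DiGraph()

for layer in range(n_layers + 1):
    for pos, tok in enumerate(tokens):
        G.add_node((layer, pos), label=tok)

for layer in range(n_layers):
    for tgt in range(len(tokens)):
        # Residual stream: each representation feeds its own next-layer node.
        G.add_edge((layer, tgt), (layer + 1, tgt), op="residual", weight=0.5)
        for src in range(tgt + 1):  # causal attention: attend to prefix only
            G.add_edge((layer, src), (layer + 1, tgt), op="attention",
                       weight=0.5 / (tgt + 1))

# A "route" into the final representation of the last token is any path
# ending at node (n_layers, last position).
final = (n_layers, len(tokens) - 1)
paths = list(nx.all_simple_paths(G, (0, 0), final))
print(f"{len(paths)} routes from the first token into the final prediction")
```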
no code implementations • 9 Sep 2023 • Elena Voita, Javier Ferrando, Christoforos Nalmpantis
Specifically, we focus on the OPT family of models ranging from 125M to 66B parameters and rely only on whether an FFN neuron is activated or not.
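A sketch of this activation check for the smallest OPT variant, using the fact that OPT's FFN nonlinearity is ReLU, so a neuron is "activated" exactly when its `fc1` pre-activation is positive; the prompt is arbitrary:

```python
# Record which FFN neurons fire in facebook/opt-125m via forward hooks.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "facebook/opt-125m"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
model.eval()

activated = {}

def make_hook(layer_idx):
    def hook(module, inputs, output):
        # output: (batch, seq, ffn_dim) pre-ReLU values from fc1.
        activated[layer_idx] = (output > 0)
    return hook

for i, layer in enumerate(model.model.decoder.layers):
    layer.fc1.register_forward_hook(make_hook(i))

with torch.no_grad():
    model(**tok("The capital of France is", return_tensors="pt"))

for i, mask in activated.items():
    print(f"layer {i}: {mask.float().mean().item():.1%} of FFN neurons fired")
```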
1 code implementation • 5 Sep 2023 • Javier Ferrando, Matthias Sperber, Hendra Setiawan, Dominic Telaar, Saša Hasan
Behavioral testing in NLP allows fine-grained evaluation of systems by examining their linguistic capabilities through the analysis of input-output behavior.
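A minimal sketch of the idea: pair inputs probing one linguistic capability (here, negation) with expectations on the output. The `translate` function is a hypothetical stand-in for the system under test:

```python
# Behavioral test cases: each input comes with a condition its output must meet.
def translate(text: str) -> str:
    # Placeholder system; in practice this calls the model being evaluated.
    return {"I am happy": "Estoy feliz",
            "I am not happy": "No estoy feliz"}[text]

test_cases = [
    # (input, substring the output must contain to pass)
    ("I am happy", "feliz"),
    ("I am not happy", "No"),   # negation must be preserved
]

for source, expected in test_cases:
    output = translate(source)
    status = "PASS" if expected in output else "FAIL"
    print(f"[{status}] {source!r} -> {output!r}")
```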
1 code implementation • 21 May 2023 • Javier Ferrando, Gerard I. Gállego, Ioannis Tsiamas, Marta R. Costa-jussà
Language Generation Models produce words based on the previous context.
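This conditioning is easy to see directly: an autoregressive LM assigns a probability to every candidate next word given only the preceding context. A small sketch using GPT-2 purely as a convenient public checkpoint:

```python
# Inspect the next-token distribution an LM induces from the previous context.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]          # scores for the next token
probs = torch.softmax(logits, dim=-1)

top = torch.topk(probs, 5)
for p, i in zip(top.values, top.indices):
    print(f"{tok.decode(i)!r}: {p.item():.3f}")
```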
no code implementations • 6 Oct 2022 • Marta R. Costa-jussà, Eric Smith, Christophe Ropers, Daniel Licht, Jean Maillard, Javier Ferrando, Carlos Escolano
We evaluate and analyze added toxicity when translating a large evaluation dataset (HOLISTICBIAS, over 472k sentences, covering 13 demographic axes) from English into 164 languages.
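A toy sketch of the core notion of added toxicity: a translation adds toxicity when it contains a toxic term that the source does not. The word lists and sentence pair below are illustrative stand-ins for real per-language toxicity lexicons:

```python
# Flag translations that introduce toxicity absent from the source.
TOXIC_EN = {"idiot"}
TOXIC_XX = {"idiota"}

def is_toxic(text: str, lexicon: set[str]) -> bool:
    return any(w in lexicon for w in text.lower().split())

def added_toxicity(source: str, translation: str) -> bool:
    return is_toxic(translation, TOXIC_XX) and not is_toxic(source, TOXIC_EN)

print(added_toxicity("He is a kind person", "Es una persona idiota"))  # True
```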
1 code implementation • 23 May 2022 • Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussà
In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix (the tokens generated so far at each decoding step).
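Formally, this conditioning is the standard factorization of the output distribution over a target sentence of length T:

```latex
% Each target token y_t depends on the source x and the target prefix y_{<t}.
p(\mathbf{y} \mid \mathbf{x}) = \prod_{t=1}^{T} p\big(y_t \mid \mathbf{y}_{<t}, \mathbf{x}\big)
```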
no code implementations • ACL 2022 • Belen Alastruey, Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà
Transformers have achieved state-of-the-art results across multiple NLP tasks.
2 code implementations • 8 Mar 2022 • Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà
The Transformer architecture aggregates input information through the self-attention mechanism, but there is no clear understanding of how this information is mixed across the entire model.
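One simple, related way to track this mixing is attention rollout (Abnar & Zuidema, 2020): average attention over heads, add an identity component for the residual connection, and compose across layers. The sketch below implements that baseline, not the exact method proposed in the paper:

```python
# Attention rollout: compose per-layer attention maps into an input-to-output
# contribution matrix.
import torch

def attention_rollout(attentions):
    """attentions: list of (heads, seq, seq) attention matrices, one per layer."""
    seq_len = attentions[0].shape[-1]
    rollout = torch.eye(seq_len)
    for attn in attentions:
        avg = attn.mean(dim=0)                       # average over heads
        avg = 0.5 * avg + 0.5 * torch.eye(seq_len)   # account for residual
        avg = avg / avg.sum(dim=-1, keepdim=True)    # renormalize rows
        rollout = avg @ rollout                      # compose with lower layers
    return rollout  # rollout[i, j]: contribution of input j to output i

layers = [torch.softmax(torch.randn(8, 5, 5), dim=-1) for _ in range(6)]
print(attention_rollout(layers))
```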
no code implementations • Findings (EMNLP) 2021 • Javier Ferrando, Marta R. Costa-jussà
This work proposes an extensive analysis of the Transformer architecture in the Neural Machine Translation (NMT) setting.
no code implementations • 24 Dec 2020 • Marta R. Costa-jussà, Carlos Escolano, Christine Basta, Javier Ferrando, Roser Batlle, Ksenia Kharitonova
Multilingual Neural Machine Translation architectures mainly differ in how many modules and parameters they share among languages.
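A toy sketch of the two extremes of that sharing spectrum, with language-specific encoder/decoder modules on one end and a single fully shared pair on the other; the module layout is illustrative only:

```python
# Compare parameter counts of language-specific vs. fully shared NMT modules.
import torch.nn as nn

LANGS = ["en", "de", "fr"]

# Language-specific: one encoder and one decoder layer per language.
specific = nn.ModuleDict(
    {f"enc_{l}": nn.TransformerEncoderLayer(d_model=512, nhead=8) for l in LANGS}
    | {f"dec_{l}": nn.TransformerDecoderLayer(d_model=512, nhead=8) for l in LANGS})

# Fully shared: one encoder/decoder pair reused for every language pair.
shared = nn.ModuleDict({
    "enc": nn.TransformerEncoderLayer(d_model=512, nhead=8),
    "dec": nn.TransformerDecoderLayer(d_model=512, nhead=8),
})

for name, m in [("language-specific", specific), ("fully shared", shared)]:
    n = sum(p.numel() for p in m.parameters())
    print(f"{name}: {n/1e6:.1f}M parameters")
```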
1 code implementation • 16 Jun 2020 • Javier Ferrando, Juan Luis Dominguez, Jordi Torres, Raul Garcia, David Garcia, Daniel Garrido, Jordi Cortada, Mateo Valero
This paper presents a study showing the benefits of EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) on the Document Classification task, an essential problem in the digitization processes of institutions.
Ranked #1 on Multi-Modal Document Classification on Tobacco-3482
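A sketch of how a lightweight EfficientNet can be dropped into such a pipeline via torchvision; the 10-way head matches Tobacco-3482's label set, while the data loading is left as an assumption:

```python
# Swap an ImageNet-pretrained EfficientNet-B0 head for document classification.
import torch.nn as nn
from torchvision.models import efficientnet_b0, EfficientNet_B0_Weights

model = efficientnet_b0(weights=EfficientNet_B0_Weights.IMAGENET1K_V1)
# Replace the 1000-class ImageNet head with a 10-way document-class head.
in_features = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_features, 10)
# Fine-tune as usual on document page images (e.g. Tobacco-3482 scans).
```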