no code implementations • WMT (EMNLP) 2021 • Carlos Escolano, Ioannis Tsiamas, Christine Basta, Javier Ferrando, Marta R. Costa-Jussa, José A. R. Fonollosa
We fine-tune mBART50 using the filtered data, and additionally, we train a Transformer model on the same data from scratch.
no code implementations • WMT (EMNLP) 2020 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa
In this article, we describe the TALP-UPC participation in the WMT20 news translation shared task for Tamil-English.
1 code implementation • ACL (WebNLG, INLG) 2020 • Oriol Domingo, David Bergés, Roser Cantenys, Roger Creus, José A. R. Fonollosa
This work establishes key guidelines on which Machine Translation (MT) techniques are worth applying to the RDF-to-Text task, and how and when to apply them.
no code implementations • ACL (WebNLG, INLG) 2020 • David Bergés, Roser Cantenys, Roger Creus, Oriol Domingo, José A. R. Fonollosa
This work describes the end-to-end system architecture presented at WebNLG Challenge 2020.
1 code implementation • 16 Feb 2024 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
The speech encoder seamlessly integrates with the MT model at inference, enabling direct translation from speech to text, across all languages supported by the MT model.
1 code implementation • 29 Sep 2023 • Casimiro Pio Carrino, Carlos Escolano, José A. R. Fonollosa
Our approach seeks to enhance cross-lingual QA transfer using a high-performing multilingual model trained on a large-scale dataset, complemented by a few thousand aligned QA examples across languages.
no code implementations • 2 Jun 2023 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
Our Speech Translation systems utilize foundation models for speech (wav2vec 2.0) and text (mBART50).
1 code implementation • 19 Dec 2022 • Ioannis Tsiamas, José A. R. Fonollosa, Marta R. Costa-jussà
End-to-end Speech Translation is hindered by a lack of available data resources.
1 code implementation • 28 Oct 2022 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
Transformers have been the dominant architecture for Speech Translation in recent years, achieving significant improvements in translation quality.
2 code implementations • 9 Feb 2022 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
Speech translation datasets provide manual segmentations of the audio, which are not available in real-world scenarios, and existing segmentation methods usually reduce translation quality significantly at inference time.
1 code implementation • ACL (IWSLT) 2021 • Gerard I. Gállego, Ioannis Tsiamas, Carlos Escolano, José A. R. Fonollosa, Marta R. Costa-jussà
Our submission also uses a custom segmentation algorithm that employs pre-trained Wav2Vec 2.0 for identifying periods of untranscribable text and can bring improvements of 2.5 to 3 BLEU on the IWSLT 2019 test set, as compared to the result with the given segmentation.
Ranked #2 on Speech-to-Text Translation on MuST-C EN->DE (using extra training data)
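The segmentation idea above can be illustrated with a toy sketch: a classifier assigns each audio frame a probability of containing transcribable speech, and contiguous high-probability runs become segments. This is an illustration only, not the paper's implementation; the `frame_probs` values below are hypothetical stand-ins for the outputs of a pre-trained Wav2Vec 2.0 classifier.

```python
# Toy probability-threshold segmentation (illustration only).
# `frame_probs` are hypothetical per-frame speech probabilities,
# standing in for the output of a pre-trained audio classifier.

def segment(frame_probs, threshold=0.5, min_len=2):
    """Return (start, end) frame-index pairs where the probability
    stays at or above `threshold` for at least `min_len` frames."""
    segments, start = [], None
    for i, p in enumerate(frame_probs):
        if p >= threshold and start is None:
            start = i                      # a segment opens here
        elif p < threshold and start is not None:
            if i - start >= min_len:       # keep only long-enough runs
                segments.append((start, i))
            start = None                   # the segment closes
    if start is not None and len(frame_probs) - start >= min_len:
        segments.append((start, len(frame_probs)))
    return segments

probs = [0.1, 0.9, 0.8, 0.7, 0.2, 0.1, 0.95, 0.9, 0.3]
print(segment(probs))  # [(1, 4), (6, 8)]
```

Low-probability frames (pauses, noise, untranscribable audio) naturally become the split points, which is why such segments tend to align with sentence-like units.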
no code implementations • 2 Nov 2020 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa, Carlos Segura
On the other hand, Multilingual Neural Machine Translation (MultiNMT) approaches rely on higher-quality and more massive data sets.
no code implementations • 29 May 2020 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa, Mikel Artetxe
We propose a modular architecture of language-specific encoder-decoders that constitutes a multilingual machine translation system that can be incrementally extended to new languages without the need for retraining the existing system when adding new languages.
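A toy sketch of the modular idea, under loose assumptions: each language owns its encoder and decoder, mapping to and from a shared intermediate representation, so adding a language only means training that language's own modules against the shared space. Everything below (the dict-based "models", the pivot vocabulary) is a hypothetical stand-in, not the paper's trained networks.

```python
# Toy illustration of language-specific encoder-decoders sharing an
# intermediate representation. All "models" here are hypothetical
# word-level lookups, not neural networks.

class ModularMT:
    def __init__(self):
        self.encoders = {}   # language -> (sentence -> shared repr)
        self.decoders = {}   # language -> (shared repr -> sentence)

    def add_language(self, lang, encoder, decoder):
        # Adding a language touches only its own modules; the
        # modules of previously added languages are left unchanged.
        self.encoders[lang] = encoder
        self.decoders[lang] = decoder

    def translate(self, sentence, src, tgt):
        shared = self.encoders[src](sentence)   # encode into shared space
        return self.decoders[tgt](shared)       # decode from shared space

# Hypothetical lookups over a shared pivot vocabulary.
en_to_pivot = {"hello": "GREETING"}
es_from_pivot = {"GREETING": "hola"}

mt = ModularMT()
mt.add_language("en",
                lambda s: [en_to_pivot[w] for w in s.split()],
                lambda r: " ".join(r))
mt.add_language("es",
                lambda s: s.split(),
                lambda r: " ".join(es_from_pivot[t] for t in r))

print(mt.translate("hello", "en", "es"))  # hola
```

The design point is that `translate` composes any encoder with any decoder through the shared space, so N languages need N encoder-decoder pairs rather than N² direction-specific systems.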
no code implementations • EACL 2021 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa, Mikel Artetxe
State-of-the-art multilingual machine translation relies on a universal encoder-decoder, which requires retraining the entire system to add new languages.
no code implementations • EMNLP (spnlp) 2020 • Noe Casas, José A. R. Fonollosa, Marta R. Costa-jussà
The dominant language modeling paradigm handles text as a sequence of discrete tokens.
3 code implementations • 11 Dec 2019 • Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
We then used this dataset to train Spanish QA systems by fine-tuning a Multilingual-BERT model.
no code implementations • ACL 2019 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa
Multilingual Neural Machine Translation approaches are based on task-specific models, and adding one more language requires retraining the whole system.
2 code implementations • 16 May 2019 • José A. R. Fonollosa, Noe Casas, Marta R. Costa-jussà
The dominant neural machine translation models are based on the encoder-decoder structure, and many of them rely on an unconstrained receptive field over source and target sequences.
Ranked #10 on Machine Translation on WMT2014 English-French
no code implementations • 15 May 2019 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa
By adding and forcing this interlingual loss, we are able to train multiple encoders and decoders for each language, sharing a common intermediate representation.
no code implementations • 15 Oct 2018 • Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa
Preliminary results on the WMT 2017 Turkish/English task show that the proposed architecture is capable of learning a universal language representation and simultaneously training both translation directions with state-of-the-art results.
1 code implementation • WS 2017 • Han Yang, Marta R. Costa-jussà, José A. R. Fonollosa
Natural language inference (NLI) is a central problem in language understanding.
no code implementations • 2 Mar 2016 • Marta R. Costa-jussà, José A. R. Fonollosa
Neural Machine Translation (NMT) has reached state-of-the-art results.
no code implementations • 25 Jan 2016 • José A. R. Fonollosa
In this paper we derive variability measures for the conditional probability distributions of a pair of random variables, and we study their application in the inference of cause-effect relationships.