Search Results for author: Gerard I. Gállego

Found 15 papers, 9 papers with code

Pretrained Speech Encoders and Efficient Fine-tuning Methods for Speech Translation: UPC at IWSLT 2022

1 code implementation IWSLT (ACL) 2022 Ioannis Tsiamas, Gerard I. Gállego, Carlos Escolano, José Fonollosa, Marta R. Costa-jussà

We further investigate the suitability of different speech encoders (wav2vec 2.0, HuBERT) for our models and the impact of knowledge distillation from the Machine Translation model that we use for the decoder (mBART).

Knowledge Distillation Machine Translation +2

Pushing the Limits of Zero-shot End-to-End Speech Translation

1 code implementation 16 Feb 2024 Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà

The speech encoder seamlessly integrates with the MT model at inference, enabling direct translation from speech to text, across all languages supported by the MT model.

Speech-to-Text Translation Translation

SpeechAlign: a Framework for Speech Translation Alignment Evaluation

no code implementations 20 Sep 2023 Belen Alastruey, Aleix Sant, Gerard I. Gállego, David Dale, Marta R. Costa-jussà

To contribute to these fields, we present SpeechAlign, a framework to evaluate the underexplored field of source-target alignment in speech models.

Speech-to-Text Translation Translation

Sign Language Translation from Instructional Videos

1 code implementation 13 Apr 2023 Laia Tarrés, Gerard I. Gállego, Amanda Duarte, Jordi Torres, Xavier Giró-i-Nieto

We report a result of 8.03 on the BLEU score, and publish the first open-source implementation of its kind to promote further advances.

Sign Language Translation

Efficient Speech Translation with Dynamic Latent Perceivers

1 code implementation 28 Oct 2022 Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà

Transformers have been the dominant architecture for Speech Translation in recent years, achieving significant improvements in translation quality.

Speech-to-Text Translation Translation

Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer

1 code implementation 23 May 2022 Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussà

In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix (what has been previously translated at a decoding step).

Machine Translation NMT +2

Measuring the Mixing of Contextual Information in the Transformer

2 code implementations 8 Mar 2022 Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà

The Transformer architecture aggregates input information through the self-attention mechanism, but there is no clear understanding of how this information is mixed across the entire model.

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

2 code implementations 9 Feb 2022 Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà

Speech translation datasets provide manual segmentations of the audios, which are not available in real-world scenarios, and existing segmentation methods usually significantly reduce translation quality at inference time.

Segmentation Speech-to-Text Translation +1

Efficient Transformer for Direct Speech Translation

no code implementations 7 Jul 2021 Belen Alastruey, Gerard I. Gállego, Marta R. Costa-jussà

When working with speech, we must face a problem: the sequence length of an audio input is not suitable for the Transformer.


End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021

1 code implementation ACL (IWSLT) 2021 Gerard I. Gállego, Ioannis Tsiamas, Carlos Escolano, José A. R. Fonollosa, Marta R. Costa-jussà

Our submission also uses a custom segmentation algorithm that employs pre-trained Wav2Vec 2.0 for identifying periods of untranscribable text and can bring improvements of 2.5 to 3 BLEU score on the IWSLT 2019 test set, as compared to the result with the given segmentation.

Ranked #2 on Speech-to-Text Translation on MuST-C EN->DE (using extra training data)

Segmentation Speech-to-Text Translation +1

Evaluating Gender Bias in Speech Translation

no code implementations LREC 2022 Marta R. Costa-jussà, Christine Basta, Gerard I. Gállego

WinoST is the speech version of WinoMT, which is an MT challenge set, and both follow an evaluation protocol to measure gender accuracy.

