1 code implementation • 11 Dec 2024 • LCM team, Loïc Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussà, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk
In this paper, we present an attempt at an architecture which operates on an explicit higher-level semantic representation, which we name a concept.
no code implementations • 11 Dec 2024 • Marta R. Costa-jussà, Bokai Yu, Pierre Andrews, Belen Alastruey, Necati Cihan Camgoz, Joe Chuang, Jean Maillard, Christophe Ropers, Arina Turkantenko, Carleigh Wood
We introduce the first highly multilingual speech and American Sign Language (ASL) comprehension dataset by extending BELEBELE.
no code implementations • 26 Sep 2024 • Belen Alastruey, Gerard I. Gállego, Marta R. Costa-jussà
Hence, we hypothesize that this issue stems from the difficulty of effectively training an encoder for direct speech translation.
1 code implementation • 18 Sep 2024 • Eduardo Sánchez, Belen Alastruey, Christophe Ropers, Pontus Stenetorp, Mikel Artetxe, Marta R. Costa-jussà
We propose a new benchmark to measure a language model's linguistic reasoning skills without relying on pre-existing language-specific knowledge.
1 code implementation • 19 Oct 2023 • Belen Alastruey, Matthias Sperber, Christian Gollan, Dominic Telaar, Tim Ng, Aashish Agarwal
Code-switching (CS), i.e., mixing different languages in a single sentence, is a common phenomenon in communication and can be challenging in many Natural Language Processing (NLP) settings.
no code implementations • 20 Sep 2023 • Belen Alastruey, Aleix Sant, Gerard I. Gállego, David Dale, Marta R. Costa-jussà
In doing so, we contribute to the ongoing research progress within the fields of Speech-to-Speech and Speech-to-Text translation.
1 code implementation • 31 Aug 2023 • Benjamin Muller, Belen Alastruey, Prangthip Hansanti, Elahe Kalbassi, Christophe Ropers, Eric Michael Smith, Adina Williams, Luke Zettlemoyer, Pierre Andrews, Marta R. Costa-jussà
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
no code implementations • 12 Jun 2023 • Belen Alastruey, Lukas Drude, Jahn Heymann, Simon Wiesler
Convolutional frontends are a typical choice for Transformer-based automatic speech recognition to preprocess the spectrogram, reduce its sequence length, and combine local information in time and frequency similarly.
1 code implementation • 23 May 2022 • Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussà
In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix (what has been previously translated at a decoding step).
no code implementations • NAACL (ACL) 2022 • Gerard Sant, Gerard I. Gállego, Belen Alastruey, Marta R. Costa-jussà
Different approaches have been proposed to overcome these problems, such as the use of efficient attention mechanisms.
no code implementations • ACL 2022 • Belen Alastruey, Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà
Transformers have achieved state-of-the-art results across multiple NLP tasks.
no code implementations • 7 Jul 2021 • Belen Alastruey, Gerard I. Gállego, Marta R. Costa-jussà
When working with speech, we must face a problem: the sequence length of an audio input is not suitable for the Transformer.