no code implementations • 8 Jul 2024 • Jarod Duret, Mickael Rouvier, Yannick Estève
In this work, we detail our submission to the 2024 edition of the MSP-Podcast Speech Emotion Recognition (SER) Challenge.
no code implementations • 8 Jul 2024 • Jarod Duret, Yannick Estève, Titouan Parcollet
Recent advancements in textless speech-to-speech translation systems have been driven by the adoption of self-supervised learning techniques.
no code implementations • 29 Jun 2024 • Mirco Ravanelli, Titouan Parcollet, Adel Moumen, Sylvain de Langen, Cem Subakan, Peter Plantinga, Yingzhi Wang, Pooneh Mousavi, Luca Della Libera, Artem Ploujnikov, Francesco Paissan, Davide Borra, Salah Zaiem, Zeyu Zhao, Shucong Zhang, Georgios Karakasidis, Sung-Lin Yeh, Pierre Champion, Aku Rouhe, Rudolf Braun, Florian Mai, Juan Zuluaga-Gomez, Seyed Mahed Mousavi, Andreas Nautsch, Xuechen Liu, Sangeet Sagar, Jarod Duret, Salima Mdhaffar, Gaelle Laperriere, Mickael Rouvier, Renato de Mori, Yannick Estève
This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face.
no code implementations • 20 Jun 2024 • Pooneh Mousavi, Luca Della Libera, Jarod Duret, Artem Ploujnikov, Cem Subakan, Mirco Ravanelli
Discrete audio tokens have recently gained considerable attention for their potential to connect audio and language processing, enabling the creation of modern multimodal large language models.
no code implementations • 15 Jun 2024 • Pooneh Mousavi, Jarod Duret, Salah Zaiem, Luca Della Libera, Artem Ploujnikov, Cem Subakan, Mirco Ravanelli
Discrete audio tokens have recently gained attention for their potential to bridge the gap between audio and language processing.
no code implementations • 11 Oct 2023 • Jarod Duret, Benjamin O'Brien, Yannick Estève, Titouan Parcollet
Textless speech-to-speech translation systems are rapidly advancing, thanks to the integration of self-supervised learning techniques.
no code implementations • 14 Sep 2023 • Victoria Mingote, Pablo Gimeno, Luis Vicente, Sameer Khurana, Antoine Laurent, Jarod Duret
This framework employs text in different source languages as input to generate speech in the target language without the need for text transcriptions in this language.
no code implementations • 29 Jun 2023 • Jarod Duret, Titouan Parcollet, Yannick Estève
We propose a method for speech-to-speech emotion-preserving translation that operates at the level of discrete speech units.
no code implementations • 2 Apr 2022 • Salima Mdhaffar, Jarod Duret, Titouan Parcollet, Yannick Estève
Our approach is based on the use of an external model trained to generate a sequence of vectorial representations from text.
Automatic Speech Recognition (ASR)
no code implementations • 10 May 2021 • Mickael Rouvier, Pierre-Michel Bousquet, Jarod Duret
The x-vector architecture has recently achieved state-of-the-art results on the speaker verification task.