no code implementations • 29 Jun 2024 • Mirco Ravanelli, Titouan Parcollet, Adel Moumen, Sylvain de Langen, Cem Subakan, Peter Plantinga, Yingzhi Wang, Pooneh Mousavi, Luca Della Libera, Artem Ploujnikov, Francesco Paissan, Davide Borra, Salah Zaiem, Zeyu Zhao, Shucong Zhang, Georgios Karakasidis, Sung-Lin Yeh, Pierre Champion, Aku Rouhe, Rudolf Braun, Florian Mai, Juan Zuluaga-Gomez, Seyed Mahed Mousavi, Andreas Nautsch, Xuechen Liu, Sangeet Sagar, Jarod Duret, Salima Mdhaffar, Gaelle Laperriere, Mickael Rouvier, Renato de Mori, Yannick Esteve
This paper presents SpeechBrain 1. 0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face.
no code implementations • 11 Sep 2023 • Titouan Parcollet, Ha Nguyen, Solene Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Esteve, Mickael Rouvier, Jerome Goulian, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains including computer vision and natural language processing.
1 code implementation • 23 Apr 2021 • Solene Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Esteve, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
In this paper, we propose LeBenchmark: a reproducible framework for assessing SSL from speech.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+6
no code implementations • 15 Mar 2020 • Natalia Tomashenko, Yuri Khokhlov, Yannick Esteve
Experimental results on the TED-LIUM corpus show that the proposed adaptation technique can be effectively integrated into DNN and TDNN setups at different levels and provide additional gain in recognition performance: up to 6% of relative word error rate reduction (WERR) over the strong feature-space adaptation techniques based on maximum likelihood linear regression (fMLLR) speaker adapted DNN baseline, and up to 18% of relative WERR in comparison with a speaker independent (SI) DNN baseline model, trained on conventional features.
no code implementations • 14 Feb 2020 • Natalia Tomashenko, Christian Raymond, Antoine Caubriere, Renato de Mori, Yannick Esteve
The dialog history is represented in the form of dialog history embedding vectors (so-called h-vectors) and is provided as an additional information to end-to-end SLU models in order to improve the system performance.
no code implementations • EMNLP (IWSLT) 2019 • Ha Nguyen, Natalia Tomashenko, Marcely Zanon Boito, Antoine Caubriere, Fethi Bougares, Mickael Rouvier, Laurent Besacier, Yannick Esteve
This paper describes the ON-TRAC Consortium translation systems developed for the end-to-end model task of IWSLT Evaluation 2019 for the English-to-Portuguese language pair.
no code implementations • 29 Sep 2019 • Natalia Tomashenko, Antoine Caubriere, Yannick Esteve, Antoine Laurent, Emmanuel Morin
This work investigates spoken language understanding (SLU) systems in the scenario when the semantic information is extracted directly from the speech signal by means of a single end-to-end neural network model.