no code implementations • 19 May 2023 • Tatiana Likhomanenko, Loren Lugosch, Ronan Collobert
Here, "unsupervised" means no labeled audio is available for the $\textit{target}$ language.
Automatic Speech Recognition (ASR)
no code implementations • 30 Oct 2021 • Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert
Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems.
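As a rough illustration of the pseudo-labeling idea referenced above (a generic sketch, not this paper's specific recipe): a seed model trained on labeled data transcribes unlabeled audio, and the resulting pseudo-labels are folded back into training. The helper functions `train` and `transcribe` below are hypothetical placeholders.

    # Generic pseudo-labeling loop (illustrative sketch only).
    # `train` and `transcribe` are hypothetical placeholder functions.
    def pseudo_label_training(labeled_set, unlabeled_audio, train, transcribe, rounds=3):
        model = train(labeled_set)                    # seed model from labeled data
        for _ in range(rounds):
            pseudo_set = [(x, transcribe(model, x))   # transcribe the unlabeled audio
                          for x in unlabeled_audio]
            model = train(labeled_set + pseudo_set)   # retrain on real + pseudo labels
        return model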
4 code implementations • 8 Jun 2021 • Mirco Ravanelli, Titouan Parcollet, Peter Plantinga, Aku Rouhe, Samuele Cornell, Loren Lugosch, Cem Subakan, Nauman Dawalatabad, Abdelwahab Heba, Jianyuan Zhong, Ju-chieh Chou, Sung-Lin Yeh, Szu-Wei Fu, Chien-Feng Liao, Elena Rastorgueva, François Grondin, William Aris, Hwidong Na, Yan Gao, Renato de Mori, Yoshua Bengio
SpeechBrain is an open-source and all-in-one speech toolkit.
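A hedged usage sketch of the toolkit's pretrained-model interface (exact import paths and model names vary across SpeechBrain releases; the WAV path is a placeholder):

    # Loading a pretrained ASR model with SpeechBrain and transcribing a file.
    # Newer releases expose this class under speechbrain.inference instead.
    from speechbrain.pretrained import EncoderDecoderASR

    asr_model = EncoderDecoderASR.from_hparams(
        source="speechbrain/asr-crdnn-rnnlm-librispeech",   # example pretrained model
        savedir="pretrained_models/asr-crdnn-rnnlm-librispeech",
    )
    print(asr_model.transcribe_file("example.wav"))         # placeholder audio path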
2 code implementations • 4 Apr 2021 • Loren Lugosch, Piyush Papreja, Mirco Ravanelli, Abdelwahab Heba, Titouan Parcollet
This paper introduces Timers and Such, a new open source dataset of spoken English commands for common voice control use cases involving numbers.
Ranked #4 on Spoken Language Understanding on Timers and Such (using extra training data)
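To give a flavour of what such a dataset pairs together, here is an illustrative example of a spoken command and a structured semantic label; the field names and intent label are assumptions for illustration, not the dataset's actual schema.

    # Illustrative (hypothetical) command/semantics pair for a numeric voice command.
    example = {
        "transcript": "set a timer for ten minutes",
        "semantics": {
            "intent": "SetTimer",                          # assumed intent label
            "slots": {"duration": 10, "unit": "minutes"},  # assumed slot structure
        },
    }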
1 code implementation • 2 Jun 2020 • Loren Lugosch, Derek Nowrouzezahrai, Brett H. Meyer
The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty.
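In symbols, the surprisal of observation x_t given its history x_{<t} is -log p(x_t | x_{<t}). A minimal sketch for a categorical autoregressive model (the paper's exact setup may differ):

    # Surprisal = negative log-likelihood of the current observation under the
    # autoregressive model; higher surprisal means a more "difficult" input.
    import torch
    import torch.nn.functional as F

    def surprisal(logits_t, x_t):
        """logits_t: (vocab,) scores for p(x_t | x_<t); x_t: integer token id."""
        return -F.log_softmax(logits_t, dim=-1)[x_t]

    # Toy check: a uniform model assigns every token surprisal log(vocab_size).
    print(surprisal(torch.zeros(100), 7))   # ~4.605 = log(100)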
3 code implementations • 21 Oct 2019 • Loren Lugosch, Brett Meyer, Derek Nowrouzezahrai, Mirco Ravanelli
End-to-end models are an attractive new approach to spoken language understanding (SLU) in which the meaning of an utterance is inferred directly from the raw audio without employing the standard pipeline composed of a separately trained speech recognizer and natural language understanding module.
Ranked #7 on Spoken Language Understanding on Snips-SmartLights
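A minimal sketch of the end-to-end formulation described above, mapping acoustic features straight to intent logits; the architecture and sizes here are placeholders, not the model proposed in the paper.

    # End-to-end SLU sketch: acoustic features -> recurrent encoder -> intent logits,
    # with no intermediate transcript. Sizes are arbitrary placeholders.
    import torch
    import torch.nn as nn

    class EndToEndSLU(nn.Module):
        def __init__(self, n_mels=40, hidden=256, n_intents=10):
            super().__init__()
            self.encoder = nn.GRU(n_mels, hidden, num_layers=2, batch_first=True)
            self.intent_head = nn.Linear(hidden, n_intents)

        def forward(self, features):                        # features: (batch, time, n_mels)
            outputs, _ = self.encoder(features)
            return self.intent_head(outputs.mean(dim=1))    # pool over time, classify

    print(EndToEndSLU()(torch.randn(2, 300, 40)).shape)     # torch.Size([2, 10])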
2 code implementations • 7 Apr 2019 • Loren Lugosch, Mirco Ravanelli, Patrick Ignoto, Vikrant Singh Tomar, Yoshua Bengio
Whereas conventional spoken language understanding (SLU) systems map speech to text, and then text to intent, end-to-end SLU systems map speech directly to intent through a single trainable model.
Ranked #15 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)
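The contrast drawn in the entry above can be sketched in a few lines; `asr`, `nlu`, and `slu_model` are hypothetical stand-ins for the trained components.

    # Conventional pipeline vs. end-to-end SLU (hypothetical component names).
    def pipeline_slu(audio, asr, nlu):
        text = asr(audio)        # speech -> text with a separately trained recognizer
        return nlu(text)         # text -> intent with a separate NLU module

    def end_to_end_slu(audio, slu_model):
        return slu_model(audio)  # speech -> intent with a single trainable model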
1 code implementation • 26 Nov 2018 • Loren Lugosch, Samuel Myer, Vikrant Singh Tomar
Keyword spotting, or wakeword detection, is an essential feature for hands-free operation of modern voice-controlled devices.
1 code implementation • 23 Oct 2018 • Loren Lugosch, Warren J. Gross
In this paper, we introduce the syndrome loss, an alternative loss function for neural error-correcting decoders based on a relaxation of the syndrome.
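The syndrome of a hard-decision word x is Hx mod 2, which is all-zero exactly when every parity check is satisfied. One common way to relax this to soft decoder outputs is the tanh-product form sketched below; this illustrates the general idea and is not necessarily the exact loss defined in the paper.

    # Soft-syndrome penalty sketch: a differentiable relaxation of the parity checks.
    # Not necessarily the paper's exact formulation.
    import torch

    def soft_syndrome_loss(llrs, H):
        """llrs: (batch, n) output log-likelihood ratios; H: (m, n) 0/1 parity-check matrix."""
        soft_bits = torch.tanh(llrs / 2)                         # LLR -> soft sign in (-1, 1)
        penalties = []
        for check in H:                                          # one term per parity check
            parity = soft_bits[:, check.bool()].prod(dim=1)      # ~+1 satisfied, ~-1 violated
            penalties.append(torch.relu(1.0 - parity))           # penalize violated checks
        return torch.stack(penalties, dim=1).mean()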
2 code implementations • 21 Jun 2017 • Eliya Nachmani, Elad Marciano, Loren Lugosch, Warren J. Gross, David Burshtein, Yair Beery
Furthermore, we demonstrate that the neural belief propagation decoder can be used to improve the performance, or alternatively reduce the computational complexity, of a close to optimal decoder of short BCH codes.
1 code implementation • 20 Jan 2017 • Loren Lugosch, Warren J. Gross
After describing our method, we compare the performance of the two neural decoding algorithms and show that our method achieves error-correction performance within 0.1 dB of the multiplicative approach and as much as 1 dB better than traditional belief propagation for the codes under consideration.