2 code implementations • 5 Jul 2024 • Bolaji Yusuf, Jan "Honza" Černocký, Murat Saraçlar
End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complementary approach to conventional keyword search, which depends on the output of automatic speech recognition (ASR) systems.
no code implementations • 5 Jul 2024 • Bolaji Yusuf, Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran
This paper explores speculative speech recognition (SSR), where we empower conventional automatic speech recognition (ASR) with speculation capabilities, allowing the recognizer to run ahead of audio.
Automatic Speech Recognition (ASR) +2
1 code implementation • 5 Jul 2024 • Bolaji Yusuf, Murat Saraçlar
End-to-end (E2E) approaches to keyword search (KWS) are considerably simpler in terms of training and indexing complexity when compared to approaches which use the output of automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR) +3
1 code implementation • 15 Aug 2023 • Bolaji Yusuf, Jan Cernocky, Murat Saraclar
Conventional keyword search systems operate on automatic speech recognition (ASR) outputs, which causes them to have a complex indexing and search pipeline.
Automatic Speech Recognition (ASR) +1
no code implementations • 20 Mar 2023 • Bolaji Yusuf, Aditya Gourav, Ankur Gandhe, Ivan Bulyko
End-to-end speech recognition models are improved by incorporating external text sources, typically by fusion with an external language model.
no code implementations • 12 Feb 2022 • Bolaji Yusuf, Ankur Gandhe, Alex Sokolov
There has been a recent focus on training E2E ASR models that get the performance benefits of external text data without incurring the extra cost of evaluating an external language model at inference time.
1 code implementation • 23 Aug 2021 • Bolaji Yusuf, Alican Gok, Batuhan Gundogdu, Murat Saraclar
Recently, neural approaches to spoken content retrieval have become popular.
no code implementations • SIGUL (LREC) 2022 • Marcely Zanon Boito, Bolaji Yusuf, Lucas Ondel, Aline Villavicencio, Laurent Besacier
Our results suggest that neural models for speech discretization are difficult to exploit in our setting, and that it might be necessary to adapt them to limit sequence length.
no code implementations • 4 Nov 2020 • Bolaji Yusuf, Lucas Ondel, Lukas Burget, Jan Cernocky, Murat Saraclar
In the target language, we infer both the language and unit embeddings in an unsupervised manner, and in so doing, we simultaneously learn a subspace of units specific to that language and the units that dwell on it.
no code implementations • 19 May 2020 • Bolaji Yusuf, Lucas Ondel
In this paper we describe our submission to the Zerospeech 2020 challenge, where the participants are required to discover latent representations from unannotated speech, and to use those representations to perform speech synthesis, with synthesis quality used as a proxy metric for the unit quality.