no code implementations • 17 Jan 2024 • Pieter De Clercq, Corentin Puffay, Jill Kries, Hugo Van hamme, Maaike Vandermosten, Tom Francart, Jonas Vanthornhout
We modeled electroencephalography (EEG) responses to acoustic, segmentation, and linguistic speech representations of a story using convolutional neural networks trained on a large sample of healthy participants, serving as a model for intact neural tracking of speech.
no code implementations • 25 Sep 2023 • Jakob Poncelet, Hugo Van hamme
Self-supervised pre-trained speech models have strongly improved speech recognition, yet they are still sensitive to domain shifts and accented or atypical speech.
1 code implementation • 23 Aug 2023 • Bastiaan Tamm, Rik Vandenberghe, Hugo Van hamme
In online conferencing applications, estimating the perceived quality of an audio signal is crucial to ensure high quality of experience for the end user.
no code implementations • 31 Jul 2023 • Mohammad Jalilpour Monesi, Jonas Vanthornhout, Hugo Van hamme, Tom Francart
Our results show that vowel-consonant onsets outperform onsets of any phone in both tasks, suggesting that some degree of neural tracking of the vowel-consonant distinction is present in the EEG.
1 code implementation • 19 Jun 2023 • Steven Vander Eeckt, Hugo Van hamme
Fine-tuning an Automatic Speech Recognition (ASR) model to new domains results in degradation on original domains, referred to as Catastrophic Forgetting (CF).
Automatic Speech Recognition (ASR) +2
no code implementations • 12 Jun 2023 • Jinzi Qi, Hugo Van hamme
In dysarthric speech recognition, data scarcity and the vast diversity between dysarthric speakers pose significant challenges.
no code implementations • 6 Mar 2023 • Bastiaan Tamm, Rik Vandenberghe, Hugo Van hamme
To fill this gap, the ADReSS-M challenge was organized.
no code implementations • 3 Feb 2023 • Corentin Puffay, Bernd Accou, Lies Bollens, Mohammad Jalilpour Monesi, Jonas Vanthornhout, Hugo Van hamme, Tom Francart
Linear models are presently used to relate the EEG recording to the corresponding speech signal.
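Linear models of this kind are often fit as a regularized least-squares problem: reconstruct a feature of the speech signal (e.g. the envelope) from the multichannel EEG. A minimal sketch with synthetic data, assuming ridge regression as the linear decoder; shapes and the regularizer are illustrative, not taken from the paper:

```python
import numpy as np

# Minimal sketch of a linear backward model: reconstruct a stand-in
# speech envelope from multichannel EEG via ridge regression.
# All data is synthetic; dimensions and the penalty are illustrative.
rng = np.random.default_rng(0)
T, C = 1000, 8                      # time samples, EEG channels
envelope = rng.standard_normal(T)   # stand-in for the speech envelope
mixing = rng.standard_normal(C)
eeg = np.outer(envelope, mixing) + 0.5 * rng.standard_normal((T, C))

lam = 1.0                           # ridge penalty
w = np.linalg.solve(eeg.T @ eeg + lam * np.eye(C), eeg.T @ envelope)
recon = eeg @ w

# Pearson correlation between reconstruction and stimulus feature is
# the usual evaluation metric for such decoders.
r = np.corrcoef(recon, envelope)[0, 1]
print(f"reconstruction correlation: {r:.2f}")
```

The correlation of the reconstruction with the true stimulus feature serves as the decoding score that nonlinear models are then compared against.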
1 code implementation • 9 Jan 2023 • Bernd Accou, Hugo Van hamme, Tom Francart
We propose a novel paradigm for the self-supervised enhancement of stimulus-related brain response data.
no code implementations • 24 Nov 2022 • Quentin Meeus, Marie-Francine Moens, Hugo Van hamme
We explore the benefits that multitask learning offers to speech processing by training models on dual objectives: automatic speech recognition combined with intent classification or sentiment classification.
Automatic Speech Recognition (ASR) +7
no code implementations • 24 Nov 2022 • Quentin Meeus, Marie-Francine Moens, Hugo Van hamme
Class attention can be used to visually explain the predictions of our model, which goes a long way in understanding how the model makes predictions.
Automatic Speech Recognition (ASR) +4
no code implementations • 18 Nov 2022 • Wim Boes, Hugo Van hamme
More specifically, visual features focusing on semantics appear appropriate in the context of automated audio captioning, while for sound event detection, time information seems to be more important.
no code implementations • 27 Oct 2022 • Steven Vander Eeckt, Hugo Van hamme
Adapting a trained Automatic Speech Recognition (ASR) model to new tasks results in catastrophic forgetting of old tasks, limiting the model's ability to learn continually and to be extended to new speakers, dialects, languages, etc.
Automatic Speech Recognition (ASR) +2
no code implementations • 24 Oct 2022 • Jinzi Qi, Hugo Van hamme
The scarcity of training data and the large speaker variation in dysarthric speech lead to poor accuracy and poor speaker generalization of spoken language understanding systems for dysarthric speech.
no code implementations • 18 Oct 2022 • Wim Boes, Hugo Van hamme
With regard to the accuracy measure, our best model achieved a score of 77.1% on the validation data, which is about the same as the performance obtained by the baseline system (77.0%).
no code implementations • 18 Oct 2022 • Wim Boes, Hugo Van hamme
This is significantly better than the performance obtained by the baseline model (0.527), which can effectively be attributed to the changes that were applied to the pooling operations of the network.
no code implementations • 14 Oct 2022 • Jakob Poncelet, Hugo Van hamme
However, subtitles are not verbatim (i.e. exact) transcriptions of speech, so they cannot be used directly to improve an Automatic Speech Recognition (ASR) model.
Automatic Speech Recognition (ASR) +2
1 code implementation • 1 Oct 2022 • Bastiaan Tamm, Helena Balabin, Rik Vandenberghe, Hugo Van hamme
Speech quality in online conferencing applications is typically assessed through human judgements in the form of the mean opinion score (MOS) metric.
no code implementations • 26 Sep 2022 • Wim Boes, Hugo Van hamme
Large-scale sound recognition data sets typically consist of acoustic recordings obtained from multimedia libraries.
no code implementations • 26 Sep 2022 • Wim Boes, Hugo Van hamme
Many state-of-the-art systems for audio tagging and sound event detection employ convolutional recurrent neural architectures.
no code implementations • 5 Jul 2022 • Corentin Puffay, Jana Van Canneyt, Jonas Vanthornhout, Hugo Van hamme, Tom Francart
To investigate how speech is processed in the brain, we can model the relation between features of a natural speech signal and the corresponding recorded electroencephalogram (EEG).
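Models of this kind typically relate the EEG not to the instantaneous stimulus feature but to time-lagged copies of it, so a linear fit can capture the latency of the neural response. A sketch of building such a lagged design matrix; the lag values are illustrative choices, not taken from the paper:

```python
import numpy as np

# Sketch of a time-lagged design matrix for stimulus-response modelling:
# each column holds the stimulus feature shifted by one delay, so a
# linear model over the columns captures response latency.
def lag_matrix(x, lags):
    """Stack delayed copies of 1-D feature x into columns (zero-padded)."""
    T = len(x)
    X = np.zeros((T, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = x[:T - lag]
        else:
            X[:T + lag, j] = x[-lag:]
    return X

feature = np.arange(5, dtype=float)   # toy stimulus feature
X = lag_matrix(feature, lags=[0, 1, 2])
print(X)
```

Negative lags are also supported, which is useful for checking that the model is not exploiting acausal structure in the data.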
no code implementations • 1 Jul 2022 • Lies Bollens, Tom Francart, Hugo Van hamme
The electroencephalogram (EEG) is a powerful method to understand how the brain processes speech.
no code implementations • 28 Jun 2022 • Pu Wang, Hugo Van hamme
End-to-end spoken language understanding (SLU) systems benefit from pretraining on large corpora, followed by fine-tuning on application-specific data.
1 code implementation • 30 Mar 2022 • Steven Vander Eeckt, Hugo Van hamme
In this paper, we aim to overcome CF for E2E ASR by inserting adapters into our model: small modules with few parameters that allow a general model to be fine-tuned to a specific task.
Automatic Speech Recognition (ASR) +1
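The core idea behind adapters can be shown in a few lines. A minimal sketch, assuming the common residual bottleneck design (down-projection, nonlinearity, up-projection, skip connection); dimensions and initialization are illustrative, not the paper's configuration:

```python
import numpy as np

# Minimal sketch of an adapter: a small bottleneck network whose output
# is added back to the frozen layer's activations, so only the adapter's
# few parameters are trained per task. Dimensions are illustrative.
rng = np.random.default_rng(0)
d_model, d_bottleneck = 256, 16

W_down = rng.standard_normal((d_model, d_bottleneck)) * 0.01
W_up = np.zeros((d_bottleneck, d_model))   # zero init: adapter starts as identity

def adapter(h):
    """Residual bottleneck: h + ReLU(h @ W_down) @ W_up."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up

h = rng.standard_normal((4, d_model))       # a batch of hidden states
out = adapter(h)
# With W_up zero-initialized the adapter is an exact identity at first,
# so inserting it cannot degrade the pretrained model before training.
print(np.allclose(out, h))                   # True
```

With 256-dimensional activations and a bottleneck of 16, each adapter adds only 2 × 256 × 16 parameters per layer, a small fraction of the full model.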
1 code implementation • 17 Dec 2021 • Steven Vander Eeckt, Hugo Van hamme
Adapting Automatic Speech Recognition (ASR) models to new domains results in a deterioration of performance on the original domain(s), a phenomenon called Catastrophic Forgetting (CF).
Automatic Speech Recognition (ASR) +2
no code implementations • 29 Sep 2021 • Jakob Poncelet, Hugo Van hamme
Recent research in speech processing exhibits a growing interest in unsupervised and self-supervised representation learning from unlabelled data to alleviate the need for large amounts of annotated data.
Automatic Speech Recognition (ASR) +2
2 code implementations • 17 Jun 2021 • Mohammad Jalilpour Monesi, Bernd Accou, Tom Francart, Hugo Van hamme
Decoding the speech signal that a person is listening to from the human brain via electroencephalography (EEG) can help us understand how our auditory system works.
no code implementations • 16 Jun 2021 • Wim Boes, Robbe Van Rompaey, Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq
We inspect the long-term learning ability of Long Short-Term Memory language models (LSTM LMs) by evaluating a contextual extension based on the Continuous Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs and by analyzing its performance.
no code implementations • 15 Jun 2021 • Pu Wang, Bagher BabaAli, Hugo Van hamme
The acoustic model is pre-trained in two stages: initialization with a corpus of normal speech and finetuning on a mixture of dysarthric and normal speech.
Automatic Speech Recognition (ASR) +2
no code implementations • 14 Jun 2021 • Jinzi Qi, Hugo Van hamme
We show that an extension of the FHVAE model succeeds in better disentangling the content-related and sequence-related representations, but both representations are still required for the best results on disorder type classification.
no code implementations • 9 Jun 2021 • Wim Boes, Hugo Van hamme
We study the merit of transfer learning for two sound recognition problems, i.e., audio tagging and sound event detection.
no code implementations • 14 May 2021 • Bernd Accou, Mohammad Jalilpour Monesi, Hugo Van hamme, Tom Francart
The accuracy of the model's match/mismatch predictions can be used as a proxy for speech intelligibility without subject-specific (re)training.
no code implementations • 30 Mar 2021 • Pu Wang, Hugo Van hamme
In this paper we combine the encoder of an end-to-end ASR system with the prior NMF/capsule network-based user-taught decoder, and investigate whether pre-training methodology can reduce training data requirements for the NMF and capsule network.
1 code implementation • 19 Dec 2019 • Jeroen Zegers, Hugo Van hamme
In this paper we propose a novel network for source separation using an encoder-decoder CNN and LSTM in parallel.
no code implementations • 19 Dec 2019 • Pieter Appeltans, Jeroen Zegers, Hugo Van hamme
This paper examines the applicability in realistic scenarios of two deep learning based solutions to the overlapping speaker separation problem.
no code implementations • 2 Dec 2019 • Wim Boes, Hugo Van hamme
We tackle the task of environmental event classification by drawing inspiration from the transformer neural network architecture used in machine translation.
1 code implementation • 30 Jan 2019 • Janneke van de Loo, Jort F. Gemmeke, Guy De Pauw, Bart Ons, Walter Daelemans, Hugo Van hamme
We present a framework for the induction of semantic frames from utterances in the context of an adaptive command-and-control interface.
no code implementations • 24 Sep 2018 • Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq
Neural cache language models (LMs) extend the idea of regular cache language models by making the cache probability dependent on the similarity between the current context and the context of the words in the cache.
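The mechanism described here can be sketched directly: score each cached word by the similarity between its stored hidden state and the current context vector, softmax those scores into a cache distribution, and interpolate with the base LM. Vocabulary size, dimensions, and the mixing weight below are illustrative assumptions:

```python
import numpy as np

# Sketch of a neural cache probability: cached words are weighted by
# the similarity of their stored hidden states to the current context,
# and the resulting cache distribution is mixed with the base LM.
def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
V, d = 10, 6                                 # vocab size, hidden size
cache_states = rng.standard_normal((3, d))   # hidden states of cached words
cache_words = np.array([2, 5, 2])            # their word ids
context = cache_states[0] + 0.1 * rng.standard_normal(d)

# Cache distribution: similarity-weighted mass on the cached word ids.
weights = softmax(cache_states @ context)
p_cache = np.zeros(V)
np.add.at(p_cache, cache_words, weights)

p_base = softmax(rng.standard_normal(V))     # stand-in base LM distribution
lam = 0.3                                    # illustrative mixing weight
p = (1 - lam) * p_base + lam * p_cache
print(f"P(word 2) with cache: {p[2]:.3f}")
```

Because the cache weights depend on hidden-state similarity rather than raw counts, recently seen words whose contexts resemble the current one receive extra probability mass.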
1 code implementation • 24 Aug 2018 • Jeroen Zegers, Hugo Van hamme
Furthermore, it is concluded that a single model trained on different scenarios is capable of matching the performance of scenario-specific models.
1 code implementation • 24 Aug 2018 • Jeroen Zegers, Hugo Van hamme
With deep learning approaches becoming state-of-the-art in many speech (as well as non-speech) related machine learning tasks, efforts are being taken to delve into the neural networks which are often considered as a black box.
no code implementations • WS 2018 • Lyan Verwimp, Hugo Van hamme, Vincent Renkens, Patrick Wambacq
We present a framework for analyzing what the state in RNNs remembers from its input embeddings.
no code implementations • 12 Sep 2017 • Lyan Verwimp, Joris Pelemans, Marieke Lycke, Hugo Van hamme, Patrick Wambacq
One model is trained on all available data (46M word tokens), but we also trained models on a specific type of TV show or domain/topic.
no code implementations • EACL 2017 • Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq
We present a Character-Word Long Short-Term Memory Language Model which both reduces the perplexity with respect to a baseline word-level language model and reduces the number of parameters of the model.
1 code implementation • LREC 2016 • Joris Pelemans, Lyan Verwimp, Kris Demuynck, Hugo Van hamme, Patrick Wambacq
In this paper we present SCALE, a new Python toolkit that contains two extensions to n-gram language models.
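To make concrete the kind of baseline SCALE builds on, here is a generic interpolated bigram model in plain Python. This is not SCALE's actual API; the corpus, names, and smoothing scheme are illustrative:

```python
from collections import Counter

# Generic sketch of a bigram LM interpolated with unigrams, the kind
# of n-gram model that toolkits like SCALE extend. Illustrative only.
corpus = "the cat sat on the mat the cat ran".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
N = len(corpus)

def p_interp(w, prev, lam=0.7):
    """P(w | prev) = lam * ML bigram + (1 - lam) * ML unigram."""
    p_uni = unigrams[w] / N
    p_bi = bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_uni

print(f"P(cat | the) = {p_interp('cat', 'the'):.3f}")
```

Interpolating toward the unigram distribution keeps unseen bigrams from receiving zero probability, at the cost of a mixing weight that must be tuned.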
no code implementations • 29 Apr 2016 • Jeroen Zegers, Hugo Van hamme
It is shown how state-of-the-art multichannel NMF for blind source separation can be easily extended to incorporate speaker recognition.
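The building block that multichannel NMF extends is ordinary NMF, which factorizes a nonnegative matrix into a spectral basis and activations. A minimal sketch with multiplicative updates minimizing squared error on toy data; sizes and iteration count are illustrative, and the multichannel and speaker-recognition extensions from the paper are beyond this sketch:

```python
import numpy as np

# Minimal sketch of NMF with multiplicative updates (squared error),
# the core decomposition that multichannel NMF separation builds on.
# The magnitude-spectrogram-like matrix V here is random toy data.
rng = np.random.default_rng(0)
F, T, K = 20, 30, 4                           # freq bins, frames, components
V = rng.random((F, K)) @ rng.random((K, T))   # toy nonnegative data, rank K

W = rng.random((F, K)) + 1e-3                 # spectral basis
H = rng.random((K, T)) + 1e-3                 # activations
eps = 1e-12                                   # guards against division by zero

for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.3f}")
```

The multiplicative form keeps W and H nonnegative throughout, which is what makes the per-source decompositions interpretable as spectra and gains.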
no code implementations • LREC 2014 • Joris Pelemans, Kris Demuynck, Hugo Van hamme, Patrick Wambacq
In this paper we present 3 applications in the domain of Automatic Speech Recognition for Dutch, all of which are developed using our in-house speech recognition toolkit SPRAAK.
Automatic Speech Recognition (ASR) +1