no code implementations • LREC 2022 • Julian Linke, Philip N. Garner, Gernot Kubin, Barbara Schuppler
Conversational speech represents one of the most complex automatic speech recognition (ASR) tasks owing to the high inter-speaker variation in both pronunciation and conversational dynamics.
Automatic Speech Recognition (ASR)
no code implementations • IWSLT 2016 • Alexandros Lazaridis, Ivan Himawan, Petr Motlicek, Iosif Mporas, Philip N. Garner
We experiment with three scenarios: (i) French, a source language uncorrelated with the target language; (ii) Ukrainian, a source language correlated with the target language; and (iii) English, a source language uncorrelated with the target language but with a relatively large amount of data compared with the other two scenarios.
no code implementations • 9 Oct 2024 • Mutian He, Philip N. Garner
In a series of empirical studies on language processing, language modeling, and speech processing, we show that CALD can effectively recover the result of the original model, and that the guiding strategy contributes to the result.
1 code implementation • 16 Sep 2024 • Haolin Chen, Philip N. Garner
Furthermore, our findings suggest that the magnitude, rather than the variance, is the primary indicator of the importance of parameters.
no code implementations • 9 Sep 2024 • Louise Coppieters de Gibson, Philip N. Garner, Pierre-Edouard Honnet
Whilst state-of-the-art automatic speech recognition (ASR) can perform well, it still degrades when exposed to acoustic environments that differ from those used when training the model.
Automatic Speech Recognition (ASR)
1 code implementation • 22 Apr 2024 • Alexandre Bittar, Philip N. Garner
Understanding cognitive processes in the brain demands sophisticated models capable of replicating neural dynamics at large scales.
1 code implementation • 19 Feb 2024 • Haolin Chen, Philip N. Garner
Our results demonstrate that catastrophic forgetting can be overcome by our methods without degrading the fine-tuning performance, and that the Kronecker-factored approximation preserves the pre-training knowledge better than the diagonal ones.
no code implementations • 29 Nov 2023 • Pavel Korshunov, Haolin Chen, Philip N. Garner, Sebastien Marcel
From the publicly available speech dataset LibriTTS, we also created a separate database containing only audio deepfakes, LibriTTS-DF, using several recent text-to-speech methods: YourTTS, Adaspeech, and TorToiSe.
no code implementations • 22 May 2023 • Mutian He, Philip N. Garner
Recently, large pretrained language models have demonstrated strong language understanding capabilities.
1 code implementation • 16 May 2023 • Mutian He, Philip N. Garner
Motivated particularly by the task of cross-lingual SLU, we demonstrate that the task of speech translation (ST) is a good means of pretraining speech models for end-to-end SLU on both intra- and cross-lingual scenarios.
no code implementations • 3 Mar 2023 • Haolin Chen, Philip N. Garner
Given the recent success of diffusion in producing natural-sounding synthetic speech, we investigate how diffusion can be used in speaker adaptive TTS.
no code implementations • 1 Dec 2022 • Alexandre Bittar, Philip N. Garner
Compared to conventional artificial neurons that produce dense and real-valued responses, biologically-inspired spiking neurons transmit sparse and binary information, which can also lead to energy-efficient implementations.
no code implementations • 22 Aug 2022 • Louise Coppieters de Gibson, Philip N. Garner
We investigate whether the inference can be inverted to provide insights into that biological system; in particular the hearing mechanism.
1 code implementation • Frontiers in Neuroscience 2022 • Alexandre Bittar, Philip N. Garner
Artificial neural networks (ANNs) are the basis of recent advances in artificial intelligence (AI); they typically use real-valued neuron responses.
Ranked #3 on Audio Classification on SSC
1 code implementation • 21 Jul 2022 • Alexandre Bittar, Philip N. Garner
Using Bayes's theorem, we derive a unit-wise recurrence as well as a backward recursion similar to the forward-backward algorithm.
1 code implementation • 9 Jun 2020 • Niccolò Antonello, Philip N. Garner
It is shown that classifiers that adopt this novel operator can be more robust to out-of-distribution samples, often outperforming NNs that use the standard softmax operator.
no code implementations • 24 Oct 2019 • Philip N. Garner, Sibo Tong
We show that introduction of a context indicator leads to a variable feedback that is similar to the forget mechanism in conventional recurrent units.
1 code implementation • 22 Jun 2018 • Branislav Gerazov, Gérard Bailly, Omar Mohammed, Yi Xu, Philip N. Garner
Our work bridges between a comprehensive generative model of intonation and state-of-the-art AI techniques.
no code implementations • EACL 2017 • Renars Liepins, Ulrich Germann, Guntis Barzdins, Alexandra Birch, Steve Renals, Susanne Weber, Peggy van der Kreeft, Hervé Bourlard, João Prieto, Ondřej Klejch, Peter Bell, Alexandros Lazaridis, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Sebastião Miranda, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, Jeff Mitchell
We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring.
Automatic Speech Recognition (ASR)
no code implementations • 15 Apr 2016 • Milos Cernak, Alexandros Lazaridis, Afsaneh Asaei, Philip N. Garner
Segmental errors are further propagated to optional suprasegmental (such as syllable) information coding.
no code implementations • 31 Aug 2014 • Mohammad J. Taghizadeh, Reza Parhizkar, Philip N. Garner, Herve Bourlard, Afsaneh Asaei
This paper addresses the problem of ad hoc microphone array calibration where only partial information about the distances between microphones is available.