1 code implementation • 12 Oct 2022 • Leanne Nortje, Herman Kamper
We formalise this task and call it visually prompted keyword localisation (VPKL): given an image depicting a keyword, detect whether and where that keyword occurs in a spoken utterance.
1 code implementation • 2 Aug 2021 • Benjamin van Niekerk, Leanne Nortje, Matthew Baas, Herman Kamper
In this paper, we first show that the per-utterance mean of CPC features captures speaker information to a large extent.
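If the per-utterance mean of CPC-style frame features largely encodes speaker identity, then subtracting that mean is a simple speaker-normalisation baseline. The sketch below illustrates the idea; the array shapes and function name are illustrative assumptions, not the paper's code.

```python
import numpy as np

def speaker_normalise(features: np.ndarray) -> np.ndarray:
    """Subtract the per-utterance mean from (T, D) frame features.

    If speaker information lives mostly in the utterance mean, the
    normalised features retain mainly linguistic content.
    """
    return features - features.mean(axis=0, keepdims=True)

# Toy check: two "utterances" sharing the same content but differing by a
# constant speaker offset become identical after normalisation.
rng = np.random.default_rng(0)
content = rng.normal(size=(50, 8))   # shared linguistic content
utt_a = content + 1.5                # speaker A offset
utt_b = content - 0.7                # speaker B offset
print(np.allclose(speaker_normalise(utt_a), speaker_normalise(utt_b)))  # True
```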
1 code implementation • 10 Dec 2020 • Leanne Nortje, Herman Kamper
We propose direct multimodal few-shot models that learn a shared embedding space of spoken words and images from only a few paired examples.
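The general recipe for learning a shared speech-image embedding space from a few paired examples can be sketched with a contrastive objective: matched (spoken word, image) pairs should score higher than mismatched ones. This is a hedged sketch of that idea only; the dimensions, margin, and loss form are assumptions, not the paper's model.

```python
import numpy as np

def l2_normalise(x: np.ndarray) -> np.ndarray:
    """Unit-normalise each row so the dot product is cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def contrastive_hinge_loss(speech_emb, image_emb, margin=0.5):
    """Hinge loss over paired examples in a shared embedding space.

    Each matched pair (i, i) should out-score every mismatched pair
    (i, j) by at least `margin`.
    """
    s = l2_normalise(speech_emb) @ l2_normalise(image_emb).T  # similarity matrix
    pos = np.diag(s)                                          # matched-pair scores
    loss = np.maximum(0.0, margin - pos[:, None] + s)         # margin violations
    np.fill_diagonal(loss, 0.0)                               # ignore the diagonal
    return loss.mean()

# Toy usage: 5 paired examples in an 8-d shared space. Nearly aligned
# pairs should incur a small loss.
rng = np.random.default_rng(1)
speech = rng.normal(size=(5, 8))
image = speech + 0.01 * rng.normal(size=(5, 8))
print(contrastive_hinge_loss(speech, image))
```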
1 code implementation • 14 Aug 2020 • Leanne Nortje, Herman Kamper
Here we compare transfer learning to unsupervised models trained on unlabelled in-domain data.
2 code implementations • 19 May 2020 • Benjamin van Niekerk, Leanne Nortje, Herman Kamper
The idea is to learn a representation of speech by predicting future acoustic units.
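One way to picture "predicting future acoustic units": quantise each frame to its nearest codebook entry, then score how well a context model predicts the unit k steps ahead. The sketch below only forms the discrete units and prediction targets; the codebook size, feature dimension, and k are illustrative assumptions.

```python
import numpy as np

def quantise(frames: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Assign each (T, D) frame to its nearest of K codebook vectors,
    returning (T,) discrete acoustic-unit ids."""
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

rng = np.random.default_rng(2)
codebook = rng.normal(size=(16, 8))   # K=16 units, D=8 features (assumed)
frames = rng.normal(size=(100, 8))    # stand-in for encoded speech frames
units = quantise(frames, codebook)

# A CPC-style objective would score a context model's prediction of
# units[t + k] given frames up to t; here we just align the targets.
k = 3
contexts, targets = units[:-k], units[k:]
print(units.shape, targets.shape)     # (100,) (97,)
```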
Ranked #1 on Acoustic Unit Discovery on ZeroSpeech 2019 English
no code implementations • 16 Apr 2019 • Ryan Eloff, André Nortje, Benjamin van Niekerk, Avashna Govender, Leanne Nortje, Arnu Pretorius, Elan van Biljon, Ewald van der Westhuizen, Lisa van Staden, Herman Kamper
For our submission to the ZeroSpeech 2019 challenge, we apply discrete latent-variable neural networks to unlabelled speech and use the discovered units for speech synthesis.