1 code implementation • 4 May 2022 • Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Vincent Hugel
This paper focuses on the loss functions that embed the error between two poses to perform deep-learning-based camera pose regression.
1 code implementation • 29 Oct 2021 • Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Łańcucki, Paweł Rychlikowski, Ricard Marxer
We investigate the performance of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC) on phoneme categorization and on phoneme and word segmentation.
1 code implementation • 22 Jun 2021 • Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski
We present a number of low-resource approaches to the tasks of the Zero Resource Speech Challenge 2021.
1 code implementation • 24 Apr 2021 • Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski
We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations.
no code implementations • 3 Jun 2020 • Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Łańcucki, Ricard Marxer, James Glass
Probabilistic Latent Variable Models (LVMs) provide an alternative to self-supervised learning approaches for linguistic representation learning from speech.
1 code implementation • 18 May 2020 • Adrian Łańcucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, Hans J. G. A. Dolfing, Sameer Khurana, Tanel Alumäe, Antoine Laurent
We show that codebook learning can suffer from poor initialization and non-stationarity of clustered encoder outputs.
no code implementations • 23 Apr 2020 • Guillaume Sanchez, Vincente Guis, Ricard Marxer, Frédéric Bouchara
Deep Learning systems have shown tremendous accuracy in image classification, at the cost of big image datasets.
no code implementations • 31 Jul 2018 • Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain
The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noises.
no code implementations • 2 Feb 2015 • Ricard Marxer, Hendrik Purwins
A system is presented that segments, clusters and predicts musical audio in an unsupervised manner, adjusting the number of (timbre) clusters instantaneously to the audio input.