Search Results for author: Ricard Marxer

Found 19 papers, 8 papers with code

Transfer Learning from Whisper for Microscopic Intelligibility Prediction

no code implementations • 2 Apr 2024 • Paul Best, Santiago Cuervo, Ricard Marxer

Macroscopic intelligibility models predict the expected human word-error-rate for a given speech-in-noise stimulus.

Automatic Speech Recognition speech-recognition +2

Scaling Properties of Speech Language Models

no code implementations • 31 Mar 2024 • Santiago Cuervo, Ricard Marxer

We establish a strong correlation between pre-training loss and downstream syntactic and semantic performance in SLMs and LLMs, which results in predictable scaling of linguistic performance.

Speech foundation models on intelligibility prediction for hearing-impaired listeners

no code implementations • 24 Jan 2024 • Santiago Cuervo, Ricard Marxer

Our method resulted in the winning submission in the CPC2, demonstrating its promise for speech perception applications.

SUCRe: Leveraging Scene Structure for Underwater Color Restoration

1 code implementation • 18 Dec 2022 • Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Maxime Ferrera, Vincent Hugel

Underwater images are altered by the physical characteristics of the medium through which light rays pass before reaching the optical sensor.

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

1 code implementation • 5 Jun 2022 • Santiago Cuervo, Adrian Łańcucki, Ricard Marxer, Paweł Rychlikowski, Jan Chorowski

The success of deep learning comes from its ability to capture the hierarchical structure of data by learning high-level representations defined in terms of low-level ones.

Acoustic Unit Discovery Disentanglement +4

Homography-Based Loss Function for Camera Pose Regression

1 code implementation • 4 May 2022 • Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Vincent Hugel

This paper focuses on the loss functions that embed the error between two poses to perform deep learning based camera pose regression.

regression

Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words

1 code implementation • 29 Oct 2021 • Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Łańcucki, Paweł Rychlikowski, Ricard Marxer

We investigate the performance on phoneme categorization and phoneme and word segmentation of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC).

Segmentation Self-Supervised Learning

Aligned Contrastive Predictive Coding

1 code implementation • 24 Apr 2021 • Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations.

A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning

no code implementations • 3 Jun 2020 • Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Lancucki, Ricard Marxer, James Glass

Probabilistic Latent Variable Models (LVMs) provide an alternative to self-supervised learning approaches for linguistic representation learning from speech.

Representation Learning Self-Supervised Learning +1

Deep Learning Classification With Noisy Labels

no code implementations • 23 Apr 2020 • Guillaume Sanchez, Vincente Guis, Ricard Marxer, Frédéric Bouchara

Deep Learning systems have shown tremendous accuracy in image classification, at the cost of big image datasets.

Classification Face Recognition +3

DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation

no code implementations • 31 Jul 2018 • Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain

The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noises.

Speech Separation

Unsupervised Incremental Learning and Prediction of Music Signals

no code implementations • 2 Feb 2015 • Ricard Marxer, Hendrik Purwins

A system is presented that segments, clusters and predicts musical audio in an unsupervised manner, adjusting the number of (timbre) clusters instantaneously to the audio input.

Clustering Incremental Learning
