Search Results for author: Ricard Marxer

Found 19 papers, 8 papers with code

Transfer Learning from Whisper for Microscopic Intelligibility Prediction

no code implementations • 2 Apr 2024 • Paul Best, Santiago Cuervo, Ricard Marxer

Macroscopic intelligibility models predict the expected human word-error-rate for a given speech-in-noise stimulus.

Automatic Speech Recognition, speech-recognition +2

Scaling Properties of Speech Language Models

no code implementations • 31 Mar 2024 • Santiago Cuervo, Ricard Marxer

We establish a strong correlation between pre-training loss and downstream syntactic and semantic performance in SLMs and LLMs, which results in predictable scaling of linguistic performance.

PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings

no code implementations • 4 Mar 2024 • Joonas Kalda, Clément Pagés, Ricard Marxer, Tanel Alumäe, Hervé Bredin

A major drawback of supervised speech separation (SSep) systems is their reliance on synthetic data, leading to poor real-world generalization.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +4

Speech foundation models on intelligibility prediction for hearing-impaired listeners

no code implementations • 24 Jan 2024 • Santiago Cuervo, Ricard Marxer

Our method resulted in the winning submission in the CPC2, demonstrating its promise for speech perception applications.

Eiffel Tower: A Deep-Sea Underwater Dataset for Long-Term Visual Localization

1 code implementation • 9 May 2023 • Clémentin Boittiaux, Claire Dune, Maxime Ferrera, Aurélien Arnaubec, Ricard Marxer, Marjolaine Matabos, Loïc Van Audenhaege, Vincent Hugel

This paper presents a new deep-sea dataset to benchmark underwater long-term visual localization.

Visual Localization

SUCRe: Leveraging Scene Structure for Underwater Color Restoration

1 code implementation • 18 Dec 2022 • Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Maxime Ferrera, Vincent Hugel

Underwater images are altered by the physical characteristics of the medium through which light rays pass before reaching the optical sensor.

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

1 code implementation • 5 Jun 2022 • Santiago Cuervo, Adrian Łańcucki, Ricard Marxer, Paweł Rychlikowski, Jan Chorowski

The success of deep learning comes from its ability to capture the hierarchical structure of data by learning high-level representations defined in terms of low-level ones.

Acoustic Unit Discovery, Disentanglement +4

Homography-Based Loss Function for Camera Pose Regression

1 code implementation • 4 May 2022 • Clémentin Boittiaux, Ricard Marxer, Claire Dune, Aurélien Arnaubec, Vincent Hugel

This paper focuses on the loss functions that embed the error between two poses to perform deep learning based camera pose regression.

regression

Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words

1 code implementation • 29 Oct 2021 • Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Łańcucki, Paweł Rychlikowski, Ricard Marxer

We investigate the performance on phoneme categorization and phoneme and word segmentation of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC).

Segmentation, Self-Supervised Learning

Information Retrieval for ZeroSpeech 2021: The Submission by University of Wroclaw

1 code implementation • 22 Jun 2021 • Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

We present a number of low-resource approaches to the tasks of the Zero Resource Speech Challenge 2021.

Information Retrieval, Retrieval

Aligned Contrastive Predictive Coding

1 code implementation • 24 Apr 2021 • Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations.

A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning

no code implementations • 3 Jun 2020 • Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Łańcucki, Ricard Marxer, James Glass

Probabilistic Latent Variable Models (LVMs) provide an alternative to self-supervised learning approaches for linguistic representation learning from speech.

Representation Learning, Self-Supervised Learning +1

Robust Training of Vector Quantized Bottleneck Models

1 code implementation • 18 May 2020 • Adrian Łańcucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, Hans J. G. A. Dolfing, Sameer Khurana, Tanel Alumäe, Antoine Laurent

We show that the codebook learning can suffer from poor initialization and non-stationarity of clustered encoder outputs.

Clustering, Disentanglement +1

Deep Learning Classification With Noisy Labels

no code implementations • 23 Apr 2020 • Guillaume Sanchez, Vincente Guis, Ricard Marxer, Frédéric Bouchara

Deep Learning systems have shown tremendous accuracy in image classification, at the cost of big image datasets.

Classification, Face Recognition +3

DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation

no code implementations • 31 Jul 2018 • Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain

The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on target speaker while filtering out other noises.

Speech Separation

Knowledge transfer between speakers for personalised dialogue management

no code implementations • WS 2015 • Iñigo Casanueva, Thomas Hain, Heidi Christensen, Ricard Marxer, Phil Green

Dialogue Management, Management +1

Remote Speech Technology for Speech Professionals - the CloudCAST initiative

no code implementations • WS 2015 • Phil Green, Ricard Marxer, Stuart Cunningham, Heidi Christensen, Frank Rudzicz, Maria Yancheva, André Coy, Massimuliano Malavasi, Lorenzo Desideri

Speech Recognition

Automatic dysfluency detection in dysarthric speech using deep belief networks

no code implementations • WS 2015 • Stacey Oue, Ricard Marxer, Frank Rudzicz

Speech Recognition

Unsupervised Incremental Learning and Prediction of Music Signals

no code implementations • 2 Feb 2015 • Ricard Marxer, Hendrik Purwins

A system is presented that segments, clusters and predicts musical audio in an unsupervised manner, adjusting the number of (timbre) clusters instantaneously to the audio input.

Clustering, Incremental Learning
