1 code implementation • 25 Oct 2023 • Luca Della Libera, Pooneh Mousavi, Salah Zaiem, Cem Subakan, Mirco Ravanelli
To the best of our knowledge, CL-MASR is the first continual learning benchmark for the multilingual ASR task.
Tasks: Automatic Speech Recognition (ASR) +2
no code implementations • 19 Oct 2023 • Francesco Paissan, Zhepei Wang, Mirco Ravanelli, Paris Smaragdis, Cem Subakan
We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio.
1 code implementation • 29 May 2023 • Juan Zuluaga-Gomez, Sara Ahmed, Danielius Visockas, Cem Subakan
We introduce a simple-to-follow recipe aligned with the SpeechBrain toolkit for accent classification based on Common Voice 7.0 (English) and Common Voice 11.0 (Italian, German, and Spanish).
Tasks: Automatic Speech Recognition (ASR) +2
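A minimal sketch of how such a recipe's classifier could be used at inference time, via SpeechBrain's generic `EncoderClassifier` interface. The model source string below is a placeholder, not the recipe's actual released checkpoint.

```python
# Minimal sketch: loading a SpeechBrain accent classifier for inference.
# The `source` identifier is a placeholder, not the recipe's released model.
from speechbrain.pretrained import EncoderClassifier

classifier = EncoderClassifier.from_hparams(
    source="path/or/hub-id/of-accent-model",  # hypothetical checkpoint
    savedir="pretrained_accent_model",
)
# classify_file returns posteriors, the best score, its index, and its label.
out_prob, score, index, label = classifier.classify_file("speech.wav")
print(f"Predicted accent: {label[0]} (score: {score.item():.3f})")
```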
1 code implementation • 3 May 2023 • Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fabio Ayres, Paris Smaragdis
In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio.
no code implementations • 2 May 2023 • Arsenii Gorin, Cem Subakan, Sajjad Abdoli, Junhao Wang, Samantha Latremouille, Charles Onu
In this paper, we explore self-supervised learning (SSL) for analyzing a first-of-its-kind database of cry recordings with clinical indications from more than a thousand newborns.
2 code implementations • 1 May 2023 • David Budaghyan, Charles C. Onu, Arsenii Gorin, Cem Subakan, Doina Precup
This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries - and the accompanying CryCeleb 2023 task, which is a public speaker verification challenge based on cry sounds.
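For context, a speaker-verification system scores whether two recordings come from the same identity (here, the same infant). The sketch below uses SpeechBrain's generic ECAPA VoxCeleb verifier as a stand-in; it is not the challenge's baseline.

```python
# Cosine-scoring verification sketch, as evaluated by tasks like CryCeleb.
# The VoxCeleb ECAPA model is a generic stand-in, not the infant-cry baseline.
from speechbrain.pretrained import SpeakerRecognition

verifier = SpeakerRecognition.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_ecapa",
)
# Returns a cosine similarity score and a boolean same-identity decision.
score, decision = verifier.verify_files("cry_enroll.wav", "cry_test.wav")
print(f"similarity={score.item():.3f}, same infant: {bool(decision)}")
```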
no code implementations • 22 Mar 2023 • Francesco Paissan, Cem Subakan, Mirco Ravanelli
In this paper, we introduce a new approach, called Posthoc Interpretation via Quantization (PIQ), for interpreting decisions made by trained classifiers.
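Illustrative only: the vector-quantization ingredient that PIQ-style interpretation builds on, snapping a classifier's latent vectors to their nearest entries in a learned codebook. The full PIQ method (which also learns a decoder to render interpretations) is not reproduced here.

```python
# Quantizing classifier latents against a codebook (PIQ's core ingredient).
import torch

def quantize(latents: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Replace each latent vector with its nearest codebook entry.

    latents:  (batch, dim)
    codebook: (num_codes, dim)
    """
    dists = torch.cdist(latents, codebook)   # (batch, num_codes) distances
    codes = dists.argmin(dim=1)              # index of nearest code
    return codebook[codes]                   # (batch, dim) quantized latents

latents = torch.randn(4, 128)
codebook = torch.randn(512, 128)
quantized = quantize(latents, codebook)
```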
1 code implementation • 19 Jun 2022 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Frédéric Lepoutre, François Grondin
Transformers have recently achieved state-of-the-art performance in speech separation.
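A minimal usage sketch, mirroring SpeechBrain's documented inference interface for the pretrained SepFormer on two-speaker mixtures.

```python
# Separating a two-speaker mixture with the pretrained SepFormer
# (model card: speechbrain/sepformer-wsj02mix, trained at 8 kHz).
import torchaudio
from speechbrain.pretrained import SepformerSeparation

model = SepformerSeparation.from_hparams(
    source="speechbrain/sepformer-wsj02mix",
    savedir="pretrained_sepformer",
)
# est_sources: (batch, time, num_speakers)
est_sources = model.separate_file(path="mixture.wav")
torchaudio.save("source1.wav", est_sources[:, :, 0].detach().cpu(), 8000)
torchaudio.save("source2.wav", est_sources[:, :, 1].detach().cpu(), 8000)
```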
1 code implementation • 15 May 2022 • Zhepei Wang, Cem Subakan, Xilin Jiang, Junkai Wu, Efthymios Tzinis, Mirco Ravanelli, Paris Smaragdis
In this paper, we work on a sound recognition system that continually incorporates new sound classes.
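Illustrative mechanics only, not the paper's exact method: one basic requirement of continually incorporating new classes is growing the classifier's output layer while preserving the weights already learned for the old classes.

```python
# Growing a classifier head for new sound classes, keeping old weights.
import torch
import torch.nn as nn

def expand_classifier(old_head: nn.Linear, num_new: int) -> nn.Linear:
    """Return a new output layer with room for `num_new` extra classes."""
    new_head = nn.Linear(old_head.in_features,
                         old_head.out_features + num_new)
    with torch.no_grad():
        # Copy over parameters for the previously learned classes.
        new_head.weight[: old_head.out_features] = old_head.weight
        new_head.bias[: old_head.out_features] = old_head.bias
    return new_head
```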
1 code implementation • 6 Feb 2022 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Francois Grondin, Mirko Bronzi
In particular, we extend our previous findings on the SepFormer by providing results on more challenging noisy and noisy-reverberant datasets, such as LibriMix, WHAM!, and WHAMR!.
Ranked #1 on Speech Enhancement on WHAM!
1 code implementation • 20 Oct 2021 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, François Grondin
First, we release the REAL-M dataset, a crowd-sourced corpus of real-life mixtures.
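Real-life mixtures like REAL-M have no reference sources, which is precisely what makes evaluation hard: the standard separation metric, scale-invariant SNR (SI-SNR), needs a reference. For context, a sketch of SI-SNR when a reference is available.

```python
# Scale-invariant SNR (SI-SNR) in dB, the standard separation metric.
import torch

def si_snr(estimate: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """SI-SNR for 1-D signals of equal length."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference (scale-invariant target).
    target = (estimate @ reference) / (reference @ reference) * reference
    noise = estimate - target
    return 10 * torch.log10(target.pow(2).sum() / noise.pow(2).sum())
```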
4 code implementations • 8 Jun 2021 • Mirco Ravanelli, Titouan Parcollet, Peter Plantinga, Aku Rouhe, Samuele Cornell, Loren Lugosch, Cem Subakan, Nauman Dawalatabad, Abdelwahab Heba, Jianyuan Zhong, Ju-chieh Chou, Sung-Lin Yeh, Szu-Wei Fu, Chien-Feng Liao, Elena Rastorgueva, François Grondin, William Aris, Hwidong Na, Yan Gao, Renato de Mori, Yoshua Bengio
SpeechBrain is an open-source and all-in-one speech toolkit.
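A minimal example mirroring the toolkit's documented inference usage: transcribing a file with a pretrained LibriSpeech model.

```python
# Transcribing speech with a pretrained SpeechBrain model.
from speechbrain.pretrained import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-crdnn-rnnlm-librispeech",
    savedir="pretrained_asr",
)
print(asr_model.transcribe_file("speech.wav"))
```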
3 code implementations • 25 Oct 2020 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Mirko Bronzi, Jianyuan Zhong
Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism.
Ranked #5 on Speech Separation on WSJ0-2mix
2 code implementations • 22 Oct 2019 • Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Cem Subakan, Paris Smaragdis
In the first step, we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracle masks is optimal.
Ranked #21 on Speech Separation on WSJ0-2mix
no code implementations • 3 Jun 2019 • Zhepei Wang, Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin
We show that, by incrementally refining a classifier with generative replay, a generator that is only 4% of the size of all previous training data matches the performance obtained by refining the classifier while keeping 20% of all previous training data.
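The generative-replay idea in miniature: when refining the classifier on new data, mix in samples drawn from a generator trained on the old data instead of storing the old data itself. `generator` and `old_classifier` below are assumed, hypothetical modules.

```python
# Augmenting a new-data batch with generated "old" samples (generative replay).
import torch

def replay_batch(generator, old_classifier, new_x, new_y, n_replay=32):
    """Mix generated old-task samples, pseudo-labeled by the previous
    classifier, into a batch of new training data."""
    with torch.no_grad():
        z = torch.randn(n_replay, generator.latent_dim)  # assumed attribute
        old_x = generator(z)
        old_y = old_classifier(old_x).argmax(dim=1)      # pseudo-labels
    return torch.cat([new_x, old_x]), torch.cat([new_y, old_y])
```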
no code implementations • 12 Mar 2018 • Cem Subakan, Oluwasanmi Koyejo, Paris Smaragdis
Popular generative model learning methods, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), constrain the latent representation to follow a simple distribution such as an isotropic Gaussian.
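What "enforcing an isotropic Gaussian latent" looks like in a VAE: the encoder outputs a mean and log-variance, and a KL penalty pulls the latent posterior toward N(0, I). A minimal sketch, not a model from the paper.

```python
# Reparameterized sampling and the KL penalty toward an isotropic Gaussian.
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Sample z ~ N(mu, diag(exp(logvar))) differentiably."""
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

def kl_to_standard_normal(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
```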
1 code implementation • 30 Oct 2017 • Cem Subakan, Paris Smaragdis
Generative source separation methods, such as non-negative matrix factorization (NMF) or auto-encoders, rely on the assumption of an output probability density.
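A minimal NMF separation sketch: factorize the mixture's magnitude spectrogram, split the learned components into two groups, and reconstruct one source with a soft (Wiener-like) mask. The component grouping here is arbitrary; real systems learn per-source dictionaries from training data.

```python
# NMF on a magnitude spectrogram with soft-mask reconstruction.
import numpy as np
from scipy.signal import stft, istft
from sklearn.decomposition import NMF

fs, x = 8000, np.random.randn(8000 * 2)        # stand-in for a real mixture
f, t, X = stft(x, fs=fs, nperseg=512)
mag = np.abs(X)

model = NMF(n_components=8, init="random", max_iter=400, random_state=0)
W = model.fit_transform(mag)                   # (freq, components)
H = model.components_                          # (components, time)

V1 = W[:, :4] @ H[:4]                          # first group of components
V2 = W[:, 4:] @ H[4:]                          # second group
mask1 = V1 / (V1 + V2 + 1e-8)                  # soft mask for "source 1"
_, s1 = istft(mask1 * X, fs=fs, nperseg=512)   # masked mixture back to time
```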
no code implementations • NeurIPS 2014 • Cem Subakan, Johannes Traa, Paris Smaragdis
In this paper, we propose a learning approach for the Mixture of Hidden Markov Models (MHMM) based on the Method of Moments (MoM).
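Not the paper's algorithm, just its starting point: method-of-moments estimators work from low-order observable statistics of the sequences, such as the empirical co-occurrence of consecutive discrete symbols, rather than from likelihood maximization.

```python
# Empirical pairwise moment of consecutive symbols, the kind of statistic
# method-of-moments estimators for HMM-family models start from.
import numpy as np

def pairwise_moment(sequences, num_symbols):
    """Empirical estimate of P(x_t = i, x_{t+1} = j) over all sequences."""
    M = np.zeros((num_symbols, num_symbols))
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            M[a, b] += 1
    return M / M.sum()

M2 = pairwise_moment([[0, 1, 1, 2], [2, 0, 1]], num_symbols=3)
```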