Search Results for author: Cem Subakan

Found 20 papers, 13 papers with code

Listenable Maps for Audio Classifiers

no code implementations19 Mar 2024 Francesco Paissan, Mirco Ravanelli, Cem Subakan

Despite the impressive performance of deep learning models across diverse tasks, their complexity poses challenges for interpretation.

Focal Modulation Networks for Interpretable Sound Classification

no code implementations5 Feb 2024 Luca Della Libera, Cem Subakan, Mirco Ravanelli

The increasing success of deep neural networks has raised concerns about their inherent black-box nature, posing challenges related to interpretability and trust.

Classification Environmental Sound Classification +1

Audio Editing with Non-Rigid Text Prompts

no code implementations19 Oct 2023 Francesco Paissan, Zhepei Wang, Mirco Ravanelli, Paris Smaragdis, Cem Subakan

We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio.

Audio Generation Style Transfer

CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice

1 code implementation29 May 2023 Juan Zuluaga-Gomez, Sara Ahmed, Danielius Visockas, Cem Subakan

We introduce a simple-to-follow recipe aligned to the SpeechBrain toolkit for accent classification based on Common Voice 7.0 (English) and Common Voice 11.0 (Italian, German, and Spanish).

Automatic Speech Recognition (ASR) +2

Unsupervised Improvement of Audio-Text Cross-Modal Representations

1 code implementation3 May 2023 Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fabio Ayres, Paris Smaragdis

In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio.

Acoustic Scene Classification Classification +2

Self-supervised learning for infant cry analysis

no code implementations2 May 2023 Arsenii Gorin, Cem Subakan, Sajjad Abdoli, Junhao Wang, Samantha Latremouille, Charles Onu

In this paper, we explore self-supervised learning (SSL) for analyzing a first-of-its-kind database of cry recordings containing clinical indications of more than a thousand newborns.

Domain Adaptation Self-Supervised Learning

CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds

2 code implementations1 May 2023 David Budaghyan, Charles C. Onu, Arsenii Gorin, Cem Subakan, Doina Precup

This paper describes the Ubenwa CryCeleb dataset, a labeled collection of infant cries, and the accompanying CryCeleb 2023 task, a public speaker verification challenge based on cry sounds.

Speaker Verification

Posthoc Interpretation via Quantization

1 code implementation22 Mar 2023 Francesco Paissan, Cem Subakan, Mirco Ravanelli

In this paper, we introduce a new approach, called Posthoc Interpretation via Quantization (PIQ), for interpreting decisions made by trained classifiers.

Image Segmentation Quantization +1

Exploring Self-Attention Mechanisms for Speech Separation

1 code implementation6 Feb 2022 Cem Subakan, Mirco Ravanelli, Samuele Cornell, Francois Grondin, Mirko Bronzi

In particular, we extend our previous findings on the SepFormer by providing results on more challenging noisy and noisy-reverberant datasets, such as LibriMix, WHAM!, and WHAMR!.

Denoising Speech Enhancement +1

Attention is All You Need in Speech Separation

4 code implementations25 Oct 2020 Cem Subakan, Mirco Ravanelli, Samuele Cornell, Mirko Bronzi, Jianyuan Zhong

Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism.

Speech Separation
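The SepFormer entry above hinges on replacing recurrence with multi-head self-attention over time frames. A minimal NumPy sketch of that mechanism follows; the projection matrices are random stand-ins for learned parameters, and the shapes (10 frames, 16-dim features, 4 heads) are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def multi_head_attention(x, num_heads, rng):
    """Minimal self-attention: project, split into heads, attend, merge.

    x: (seq_len, d_model). Weights are random here purely for illustration.
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    # Random projections stand in for the learned Q/K/V/output weights.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                      for _ in range(4))

    def split(h):  # (seq, d_model) -> (heads, seq, d_head)
        return h.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax per query
    out = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 16))    # 10 time frames, 16-dim features
y = multi_head_attention(frames, num_heads=4, rng=rng)
print(y.shape)  # (10, 16): each frame attends to every other frame
```

Because every frame attends to every other frame in one step, long-range dependencies do not have to propagate through a recurrent state, which is the appeal over standard RNN separators.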

Two-Step Sound Source Separation: Training on Learned Latent Targets

2 code implementations22 Oct 2019 Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Cem Subakan, Paris Smaragdis

In the first step, we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal.

Speech Separation Vocal Bursts Valence Prediction
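The "oracle" masking that the two-step entry above optimizes toward can be illustrated with an ideal ratio mask: given the true sources, the best mask a masking-based separator could emit. This toy sketch uses random nonnegative arrays as stand-in source representations (assumed shapes, not the paper's setup).

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy latent representations of two sources (think magnitude spectrograms).
s1 = np.abs(rng.standard_normal((64, 100)))
s2 = np.abs(rng.standard_normal((64, 100)))
mix = s1 + s2

# Oracle (ideal ratio) mask: fraction of the mixture belonging to source 1.
eps = 1e-8
m1 = s1 / (s1 + s2 + eps)
est1 = m1 * mix          # estimate of source 1
est2 = (1 - m1) * mix    # the remainder is source 2

print(np.max(np.abs(est1 - s1)))  # ~0: oracle masking is exact for additive mixes
```

In a fixed transform like the STFT the oracle is only an upper bound; the paper's idea is to learn the transform so that this masking upper bound is as high as possible before training the separator.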

Continual Learning of New Sound Classes using Generative Replay

no code implementations3 Jun 2019 Zhepei Wang, Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin

We show that, by incrementally refining a classifier with generative replay, a generator that is 4% of the size of all previous training data matches the performance obtained by refining the classifier while keeping 20% of all previous training data.

Continual Learning Sound Classification
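The generative-replay idea above, storing a small generator instead of the old training data, can be sketched with deliberately crude stand-ins: a per-class Gaussian as the "generator" and a nearest-centroid classifier. All class means, sizes, and dimensions here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Old task: two sound-feature classes. We keep only a tiny generative model
# (per-class mean/std), not the raw data -- this plays the role of the generator.
old_data = {0: rng.normal(0.0, 1.0, (500, 8)), 1: rng.normal(4.0, 1.0, (500, 8))}
generator = {c: (x.mean(0), x.std(0)) for c, x in old_data.items()}

# A new class arrives; refine the classifier on replayed + new samples.
new_data = {2: rng.normal(-4.0, 1.0, (500, 8))}
replayed = {c: rng.normal(mu, sd, (200, 8)) for c, (mu, sd) in generator.items()}

train = {**replayed, **new_data}
centroids = {c: x.mean(0) for c, x in train.items()}  # nearest-centroid classifier

def predict(x):
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

acc = np.mean([predict(x) == c for c, xs in old_data.items() for x in xs])
print(acc)  # old classes are still recognized without storing their data
```

The paper's point is the storage trade-off: the generator's parameters are far smaller than the replay buffer they replace, yet refining on generated samples keeps the old classes from being forgotten.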

Learning the Base Distribution in Implicit Generative Models

no code implementations12 Mar 2018 Cem Subakan, Oluwasanmi Koyejo, Paris Smaragdis

Popular generative model learning methods, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), enforce the latent representation to follow a simple distribution such as an isotropic Gaussian.
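The contrast the abstract draws, a fixed isotropic prior versus a learned base distribution, can be sketched with a maximum-likelihood Gaussian fit standing in for the learned base. The latent dimensionality, the mixing matrix, and the shift below are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed isotropic prior, as assumed by standard GAN/VAE training.
z_prior = rng.standard_normal((1000, 2))

# Suppose the model's latents actually form a correlated, shifted cloud.
A = np.array([[2.0, 0.0], [1.5, 0.5]])
z_data = rng.standard_normal((1000, 2)) @ A.T + np.array([3.0, -1.0])

# "Learning the base distribution": fit mean/covariance to those latents
# instead of assuming N(0, I) (a full-covariance Gaussian as a stand-in).
mu = z_data.mean(0)
cov = np.cov(z_data, rowvar=False)
z_learned = rng.multivariate_normal(mu, cov, size=1000)

print(np.linalg.norm(z_prior.mean(0) - mu))    # isotropic prior misses the shift
print(np.linalg.norm(z_learned.mean(0) - mu))  # learned base matches it closely
```

Samples drawn from the fitted base land where the latents actually live, whereas samples from the isotropic prior would have to be corrected by the decoder.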

Generative Adversarial Source Separation

1 code implementation30 Oct 2017 Cem Subakan, Paris Smaragdis

Generative source separation methods, such as non-negative matrix factorization (NMF) or auto-encoders, rely on the assumption of an output probability density.
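For readers unfamiliar with the NMF baseline named above, here is a minimal sketch of the classic Lee-Seung multiplicative updates for the Euclidean objective ||V - WH||^2, on a random nonnegative matrix (the sizes and rank are arbitrary, and this is the generic algorithm, not the paper's method).

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.abs(rng.standard_normal((40, 60)))  # nonnegative "spectrogram"
rank, eps = 5, 1e-9

# Random nonnegative initialization of bases W and activations H.
W = rng.random((40, rank)) + eps
H = rng.random((rank, 60)) + eps

# Lee-Seung multiplicative updates: ratios of nonnegative terms, so
# W and H stay nonnegative and the Euclidean objective is non-increasing.
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(err)  # relative reconstruction error, well below 1.0
```

In audio use, the columns of W act as spectral templates and the rows of H as their activations over time; separation assigns templates to sources and reconstructs each source from its share of WH.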

Spectral Learning of Mixture of Hidden Markov Models

no code implementations NeurIPS 2014 Cem Subakan, Johannes Traa, Paris Smaragdis

In this paper, we propose a learning approach for the Mixture of Hidden Markov Models (MHMM) based on the Method of Moments (MoM).
