Search Results for author: Cem Subakan

Found 18 papers, 12 papers with code

Audio Editing with Non-Rigid Text Prompts

no code implementations • 19 Oct 2023 • Francesco Paissan, Zhepei Wang, Mirco Ravanelli, Paris Smaragdis, Cem Subakan

We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio.

Audio Generation, Style Transfer

CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice

1 code implementation • 29 May 2023 • Juan Zuluaga-Gomez, Sara Ahmed, Danielius Visockas, Cem Subakan

We introduce a simple-to-follow recipe aligned to the SpeechBrain toolkit for accent classification based on Common Voice 7.0 (English) and Common Voice 11.0 (Italian, German, and Spanish).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Unsupervised Improvement of Audio-Text Cross-Modal Representations

1 code implementation • 3 May 2023 • Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fabio Ayres, Paris Smaragdis

In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio.

Acoustic Scene Classification, Classification +2

Self-supervised learning for infant cry analysis

no code implementations • 2 May 2023 • Arsenii Gorin, Cem Subakan, Sajjad Abdoli, Junhao Wang, Samantha Latremouille, Charles Onu

In this paper, we explore self-supervised learning (SSL) for analyzing a first-of-its-kind database of cry recordings containing clinical indications of more than a thousand newborns.

Domain Adaptation, Self-Supervised Learning

CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds

2 code implementations • 1 May 2023 • David Budaghyan, Charles C. Onu, Arsenii Gorin, Cem Subakan, Doina Precup

This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries - and the accompanying CryCeleb 2023 task, which is a public speaker verification challenge based on cry sounds.

Speaker Verification

Posthoc Interpretation via Quantization

no code implementations • 22 Mar 2023 • Francesco Paissan, Cem Subakan, Mirco Ravanelli

In this paper, we introduce a new approach, called Posthoc Interpretation via Quantization (PIQ), for interpreting decisions made by trained classifiers.

Image Segmentation, Quantization +1

Resource-Efficient Separation Transformer

1 code implementation • 19 Jun 2022 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Frédéric Lepoutre, François Grondin

Transformers have recently achieved state-of-the-art performance in speech separation.

Speech Separation

Exploring Self-Attention Mechanisms for Speech Separation

1 code implementation • 6 Feb 2022 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Francois Grondin, Mirko Bronzi

In particular, we extend our previous findings on the SepFormer by providing results on more challenging noisy and noisy-reverberant datasets, such as LibriMix, WHAM!, and WHAMR!.

Denoising, Speech Enhancement +1

Attention is All You Need in Speech Separation

3 code implementations • 25 Oct 2020 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Mirko Bronzi, Jianyuan Zhong

Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism.

Speech Separation

Two-Step Sound Source Separation: Training on Learned Latent Targets

2 code implementations • 22 Oct 2019 • Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Cem Subakan, Paris Smaragdis

In the first step, we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal.

Speech Separation, Vocal Bursts Valence Prediction

Continual Learning of New Sound Classes using Generative Replay

no code implementations • 3 Jun 2019 • Zhepei Wang, Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin

We show that, by incrementally refining a classifier with generative replay, a generator that is 4% of the size of all previous training data matches the performance of refining the classifier while keeping 20% of all previous training data.

Continual Learning, Sound Classification

Learning the Base Distribution in Implicit Generative Models

no code implementations • 12 Mar 2018 • Cem Subakan, Oluwasanmi Koyejo, Paris Smaragdis

Popular generative model learning methods such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) constrain the latent representation to follow simple distributions such as an isotropic Gaussian.

Generative Adversarial Source Separation

1 code implementation • 30 Oct 2017 • Cem Subakan, Paris Smaragdis

Generative source separation methods such as non-negative matrix factorization (NMF) or auto-encoders rely on the assumption of an output probability density.

Spectral Learning of Mixture of Hidden Markov Models

no code implementations • NeurIPS 2014 • Cem Subakan, Johannes Traa, Paris Smaragdis

In this paper, we propose a learning approach for the Mixture of Hidden Markov Models (MHMM) based on the Method of Moments (MoM).
