no code implementations • 23 Aug 2024 • Tiago Tavares, Fabio Ayres, Zhepei Wang, Paris Smaragdis
Recent advances in audio-text cross-modal contrastive learning have shown its potential towards zero-shot learning.
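The abstract doesn't describe the model itself, but the zero-shot mechanism behind audio-text contrastive learning is standard: embed the audio and a text prompt per candidate label in a shared space, then pick the most similar prompt. A minimal sketch, assuming pretrained encoders have already produced the embeddings (the 3-d vectors below are toy values, not real encoder outputs):

```python
import numpy as np

def zero_shot_classify(audio_emb, text_embs, labels):
    """Pick the label whose prompt embedding is closest (cosine) to the audio.

    audio_emb: (d,) vector from a pretrained audio encoder (assumed given).
    text_embs: (n, d) vectors for prompts like "the sound of a <label>".
    """
    a = audio_emb / np.linalg.norm(audio_emb)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = t @ a  # cosine similarity to each label prompt
    return labels[int(np.argmax(sims))], sims

# Toy 3-d embeddings standing in for real encoder outputs.
labels = ["dog bark", "siren", "rain"]
text_embs = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
audio_emb = np.array([0.1, 0.9, 0.2])
pred, _ = zero_shot_classify(audio_emb, text_embs, labels)
# pred == "siren"
```

No labeled audio for the target classes is needed: classes are added by writing new prompts, which is what makes the approach zero-shot.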
no code implementations • 19 Oct 2023 • Francesco Paissan, Luca Della Libera, Zhepei Wang, Mirco Ravanelli, Paris Smaragdis, Cem Subakan
We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio.
1 code implementation • 3 May 2023 • Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fabio Ayres, Paris Smaragdis
In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio.
no code implementations • 23 Feb 2023 • Zhepei Wang, Ritwik Giri, Devansh Shah, Jean-Marc Valin, Michael M. Goodwin, Paris Smaragdis
In this study, we present an approach to train a single speech enhancement network that can perform both personalized and non-personalized speech enhancement.
no code implementations • 18 Jun 2022 • Zhepei Wang, Ritwik Giri, Shrikant Venkataramani, Umut Isik, Jean-Marc Valin, Paris Smaragdis, Mike Goodwin, Arvindh Krishnaswamy
In this work, we propose Exformer, a time-domain architecture for target speaker extraction.
1 code implementation • 15 May 2022 • Zhepei Wang, Cem Subakan, Xilin Jiang, Junkai Wu, Efthymios Tzinis, Mirco Ravanelli, Paris Smaragdis
In this paper, we work on a sound recognition system that continually incorporates new sound classes.
no code implementations • 28 Mar 2022 • Siyuan Yuan, Zhepei Wang, Umut Isik, Ritwik Giri, Jean-Marc Valin, Michael M. Goodwin, Arvindh Krishnaswamy
Singing voice separation aims to separate music into vocals and accompaniment components.
1 code implementation • 11 May 2021 • Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, Paris Smaragdis
We propose FEDENHANCE, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID distributed data across multiple clients.
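The paper's training details aren't in this snippet, but the federated-learning backbone it builds on is size-weighted aggregation of locally trained models (FedAvg-style). A minimal sketch of that aggregation step, with all names hypothetical:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Average client model weights, weighted by local dataset size.

    client_weights: one list of numpy arrays per client (same shapes).
    client_sizes: local example counts; with non-IID data these weights
    keep large clients from being drowned out by small ones.
    """
    total = sum(client_sizes)
    agg = [np.zeros_like(w) for w in client_weights[0]]
    for weights, n in zip(client_weights, client_sizes):
        for a, w in zip(agg, weights):
            a += (n / total) * w
    return agg

# Two clients, each holding a single 2-element weight vector.
w1 = [np.array([1.0, 2.0])]
w2 = [np.array([3.0, 4.0])]
avg = fed_avg([w1, w2], [1, 3])
# avg[0] == [2.5, 3.5]
```

Raw audio never leaves a client; only weights are exchanged, which is the privacy argument for FL in speech enhancement.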
3 code implementations • 3 Mar 2021 • Efthymios Tzinis, Zhepei Wang, Xilin Jiang, Paris Smaragdis
Recent progress in audio source separation led by deep learning has enabled many neural network models to provide robust solutions to this fundamental estimation problem.
Ranked #5 on Speech Separation on WHAMR!
no code implementations • 16 Feb 2021 • Zhepei Wang, Ritwik Giri, Umut Isik, Jean-Marc Valin, Arvindh Krishnaswamy
Given a limited set of labeled data, we present a method to leverage a large volume of unlabeled data to improve the model's performance.
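The snippet doesn't say which semi-supervised technique is used; one common way to exploit unlabeled data, shown purely as an illustrative sketch (not necessarily this paper's method), is confidence-thresholded pseudo-labeling:

```python
import numpy as np

def pseudo_label(model_probs, threshold=0.9):
    """Keep only confidently predicted unlabeled examples as extra training data.

    model_probs: (n, n_classes) softmax outputs on unlabeled examples.
    Returns a boolean keep-mask and the argmax labels.
    """
    conf = model_probs.max(axis=1)
    keep = conf >= threshold
    labels = model_probs.argmax(axis=1)
    return keep, labels

# Three unlabeled examples; only the confident ones survive.
probs = np.array([[0.95, 0.05],
                  [0.60, 0.40],
                  [0.10, 0.90]])
keep, labels = pseudo_label(probs)
# keep == [True, False, True]
```

The kept examples are then mixed into the labeled set for further training.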
2 code implementations • 20 Aug 2020 • Xin Zhou, Zhepei Wang, Chao Xu, Fei Gao
Gradient-based planners are widely used for quadrotor local planning, in which a Euclidean Signed Distance Field (ESDF) is crucial for evaluating gradient magnitude and direction.
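To make the ESDF's role concrete: it assigns every grid cell its signed distance to the nearest obstacle, and its spatial gradient tells the planner which way (and how hard) to push a trajectory away from obstacles. A brute-force 2-D sketch (real planners use incremental or wavefront algorithms, not this O(n²) loop):

```python
import numpy as np

def esdf(occupancy, cell_size=1.0):
    """Brute-force Euclidean Signed Distance Field on a small 2-D grid.

    Positive in free space, negative inside obstacles, so the gradient
    points away from obstacles -- the quantity a gradient-based local
    planner evaluates along a candidate trajectory.
    """
    h, w = occupancy.shape
    occ = np.argwhere(occupancy == 1)
    free = np.argwhere(occupancy == 0)
    field = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            targets = occ if occupancy[y, x] == 0 else free
            d = np.sqrt(((targets - [y, x]) ** 2).sum(axis=1)).min()
            field[y, x] = d if occupancy[y, x] == 0 else -d
    return field * cell_size

grid = np.zeros((5, 5), dtype=int)
grid[2, 2] = 1  # single obstacle cell
field = esdf(grid)
gy, gx = np.gradient(field)  # gradient direction/magnitude for the planner
```

The paper's point is that maintaining this field over the whole map is expensive, motivating ESDF-free planning.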
4 code implementations • 14 Jul 2020 • Efthymios Tzinis, Zhepei Wang, Paris Smaragdis
In this paper, we present an efficient neural network for end-to-end general purpose audio source separation.
Ranked #11 on Speech Separation on WHAMR!
2 code implementations • 22 Oct 2019 • Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Cem Subakan, Paris Smaragdis
In the first step we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal.
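The "oracle mask" idea referenced above can be shown in miniature: given the true source in some (near-additive) latent space, the ideal ratio mask recovers it from the mixture, and the learned transform is chosen so this oracle performance is as high as possible. A toy sketch with made-up latent magnitudes:

```python
import numpy as np

def oracle_ratio_mask(source_mag, mix_mag, eps=1e-8):
    """Ideal ratio mask: per-bin fraction of the mixture owned by the source."""
    return source_mag / (mix_mag + eps)

# Toy latent-space magnitudes for two sources and their mixture.
s1 = np.array([3.0, 0.0, 1.0])
s2 = np.array([1.0, 2.0, 1.0])
mix = s1 + s2
m1 = oracle_ratio_mask(s1, mix)
est = m1 * mix
# est is (numerically) s1 when the latent space is additive
```

In practice the transform is learned rather than fixed (unlike the STFT), which is what the two-step training targets.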
Ranked #26 on Speech Separation on WSJ0-2mix
no code implementations • 3 Jun 2019 • Zhepei Wang, Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin
We show that, when incrementally refining a classifier with generative replay, a generator whose size is 4% of all previous training data matches the performance of refining the classifier while keeping 20% of all previous training data.
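Generative replay, as used here, means rehearsing old classes from a generator instead of a stored buffer: each training batch mixes real data for the new classes with generated samples for the old ones. A schematic sketch with a hypothetical per-class generator:

```python
import numpy as np

def replay_batch(new_x, new_y, generator, old_classes, replay_size, rng):
    """Mix real new-class data with generator samples for old classes,
    so the classifier is refined without storing past training data."""
    old_y = rng.choice(old_classes, size=replay_size)
    old_x = np.stack([generator(c) for c in old_y])
    x = np.concatenate([new_x, old_x])
    y = np.concatenate([new_y, old_y])
    perm = rng.permutation(len(y))
    return x[perm], y[perm]

# Hypothetical generator: a fixed 4-d feature vector per class label.
gen = lambda c: np.full(4, float(c))
rng = np.random.default_rng(0)
x, y = replay_batch(np.ones((2, 4)), np.array([5, 5]), gen, [0, 1, 2], 3, rng)
# x has 5 rows: 2 real new-class samples plus 3 replayed ones
```

The storage saving in the abstract comes from the generator replacing the raw replay buffer.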