In this work, we propose Exformer, a time-domain architecture for target speaker extraction.
In this paper, we present a self-supervised learning framework for continually learning representations for new sound classes.
Singing voice separation aims to separate a music recording into vocal and accompaniment components.
We propose FEDENHANCE, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID distributed data across multiple clients.
Recent progress in audio source separation, led by deep learning, has enabled many neural network models to provide robust solutions to this fundamental estimation problem.
Given a limited set of labeled data, we present a method to leverage a large volume of unlabeled data to improve the model's performance.
Gradient-based planners are widely used for quadrotor local planning, in which a Euclidean Signed Distance Field (ESDF) is crucial for evaluating gradient magnitude and direction.
In this paper, we present an efficient neural network for end-to-end general-purpose audio source separation.
In the first step, we learn a transform (and its inverse) to a latent space in which masking-based separation using oracle masks is optimal.
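To make the oracle-masking step concrete, here is a minimal NumPy sketch of ratio-mask separation in a transform domain. A fixed nonnegative representation stands in for the learned transform, and the function name is illustrative, not from the paper:

```python
import numpy as np

def oracle_ratio_mask_separate(sources, eps=1e-8):
    """Separate a mixture with oracle ratio masks in a transform domain.

    `sources` has shape (n_src, n_frames, n_bins) and holds nonnegative
    transform-domain representations of the true sources. The mixture is
    their sum; each oracle mask is the source's share of that sum.
    Returns the mask-based estimate of every source.
    """
    mixture = sources.sum(axis=0)          # mixture in the transform domain
    masks = sources / (mixture + eps)      # oracle ratio masks in [0, 1]
    return masks * mixture                 # masked source estimates
```

With nonnegative, additive representations the oracle estimates recover the sources almost exactly; the paper's point is to learn a transform where this idealized condition holds as closely as possible.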
We show that, by incrementally refining a classifier with generative replay, a generator whose size is 4% of all previous training data matches the performance of refining the classifier while retaining 20% of all previous training data.
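The generative-replay idea can be sketched in a few lines: instead of storing old training data, a small generative model of past classes supplies replay samples that are mixed with the new data when the classifier is refit. This toy sketch uses a per-class Gaussian as the "generator" and a nearest-centroid classifier; both stand-ins, and all names, are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

class GaussianReplayGenerator:
    """Toy per-class Gaussian standing in for a learned generative model."""

    def __init__(self):
        self.stats = {}  # class label -> (mean vector, std vector)

    def fit(self, X, y):
        for c in np.unique(y):
            Xc = X[y == c]
            self.stats[c] = (Xc.mean(axis=0), Xc.std(axis=0) + 1e-6)

    def sample(self, n_per_class):
        Xs, ys = [], []
        for c, (mu, sd) in self.stats.items():
            Xs.append(rng.normal(mu, sd, size=(n_per_class, mu.size)))
            ys.append(np.full(n_per_class, c))
        return np.vstack(Xs), np.concatenate(ys)

def refine_with_replay(generator, X_new, y_new, n_replay):
    """Refit a nearest-centroid classifier on new data plus generated replay."""
    X_old, y_old = generator.sample(n_replay)   # replay old classes from the generator
    X = np.vstack([X_old, X_new])
    y = np.concatenate([y_old, y_new])
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}
```

The design point mirrors the abstract: the generator's parameters, not the raw past data, are what persists across increments, which is why its storage footprint can be a small fraction of the data it replaces.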