Speech Extraction

7 papers with code • 1 benchmark • 3 datasets

Benchmarks

These leaderboards track progress in speech extraction.

Dataset	Best Model
WSJ0-2mix-extr	SpEx+ (tied)

Datasets

Most implemented papers

Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

jyhan03/channel-decorrelation • 23 Jan 2020

First, we propose a time-domain implementation of SpeakerBeam similar to that proposed for a time-domain audio separation network (TasNet), which has achieved state-of-the-art performance for speech separation.

Target Speech Extraction Based on Blind Source Separation and X-vector-based Speaker Selection Trained with Data Augmentation

annie-gu/MVAEBasedBSE • 16 May 2020

Extracting the desired speech from a mixture is a meaningful and challenging task.

Multi-channel target speech extraction with channel decorrelation and target speaker adaptation

jyhan03/channel-decorrelation • 19 Oct 2020

In this work, we propose two methods for exploiting the multi-channel spatial information to extract the target speech.

DPCCN: Densely-Connected Pyramid Complex Convolutional Network for Robust Speech Separation And Extraction

jyhan03/icassp22-dataset • 27 Dec 2021

Particularly, we find that the Mixture-Remix fine-tuning with DPCCN significantly outperforms the TD-SpeakerBeam for unsupervised cross-domain TSE, with around 3.5 dB SISNR improvement on target domain test set, without any source domain performance degradation.

On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement

sp-uhh/deep-non-linear-filter • 22 Jun 2022

Employing deep neural networks (DNNs) to directly learn filters for multi-channel speech enhancement has potentially two key advantages over a traditional approach combining a linear spatial filter with an independent tempo-spectral post-filter: 1) non-linear spatial filtering allows to overcome potential restrictions originating from a linear processing model and 2) joint processing of spatial and tempo-spectral information allows to exploit interdependencies between different sources of information.

Analysis of impact of emotions on target speech extraction and speech separation

butspeechfit/ravdess2mix • 15 Aug 2022

One of the factors causing such degradation may be intrinsic speaker variability, such as emotions, occurring commonly in realistic speech.

Neural Target Speech Extraction: An Overview

butspeechfit/speakerbeam • 31 Jan 2023

Humans can listen to a target speaker even in challenging acoustic conditions that have noise, reverberation, and interfering speakers.
