TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Speech Enhancement	spatialized DNS challenge	DeFT-AN	SI-SDR	9.9	# 1
Speech Enhancement	spatialized DNS challenge	DeFT-AN	PESQ	3.01	# 1
Speech Enhancement	spatialized DNS challenge	DeFT-AN	STOI	0.924	# 1
Speech Dereverberation	spatialized WSJCAM0	DeFT-AN	SI-SDR	15.7	# 1
Speech Dereverberation	spatialized WSJCAM0	DeFT-AN	PESQ	3.63	# 1
Speech Dereverberation	spatialized WSJCAM0	DeFT-AN	STOI	0.981	# 1

DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement

15 Dec 2022 · Dongheon Lee, Jung-Woo Choi

In this study, we propose a dense frequency-time attentive network (DeFT-AN) for multichannel speech enhancement. DeFT-AN is a mask estimation network that predicts a complex spectral masking pattern for suppressing the noise and reverberation embedded in the short-time Fourier transform (STFT) of an input signal. The proposed mask estimation network incorporates three different types of blocks for aggregating information in the spatial, spectral, and temporal dimensions. It utilizes a spectral transformer with a modified feed-forward network and a temporal conformer with sequential dilated convolutions. The use of dense blocks and transformers dedicated to the three different characteristics of audio signals enables more comprehensive enhancement in noisy and reverberant environments. DeFT-AN outperforms state-of-the-art multichannel models on two popular noisy and reverberant datasets across several metrics of speech quality and intelligibility.
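
The masking step described in the abstract can be illustrated with a minimal sketch. It assumes the estimated complex mask is multiplied element-wise with the STFT of one reference channel, which the abstract does not spell out. The function name `apply_complex_mask`, the toy array shapes, and the oracle mask standing in for a network output are all hypothetical and are not taken from the DeFT-AN code.

```python
import numpy as np


def apply_complex_mask(noisy_stft, mask):
    """Multiply a complex mask element-wise with a complex STFT.

    Both arrays are assumed to have shape (frequency_bins, frames).
    """
    if noisy_stft.shape != mask.shape:
        raise ValueError("mask and STFT must have the same shape")
    return mask * noisy_stft


# Toy STFTs: 4 frequency bins x 3 frames of random complex values.
rng = np.random.default_rng(0)
shape = (4, 3)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
noisy = clean + noise

# Oracle complex ratio mask, computable only when the clean signal is known.
# A mask estimation network would predict an approximation of this quantity.
ideal_mask = clean / noisy

enhanced = apply_complex_mask(noisy, ideal_mask)
```

Because the mask is complex-valued, it can adjust phase as well as magnitude. A real-valued magnitude mask would leave the noisy phase untouched and could not recover `clean` exactly in this toy case.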

Code

donghoney0416/DeFT-AN

Tasks

Denoising

Speech Dereverberation

Speech Enhancement

Results from the Paper

Ranked #1 on Speech Enhancement on spatialized DNS challenge
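
For readers unfamiliar with the SI-SDR metric listed in the results, the following is a minimal sketch of its commonly used definition: the estimate is projected onto the reference, and the energy of that projection is compared with the energy of the residual, in decibels. The function name `si_sdr`, the zero-mean step, and the `eps` stabilizer are choices made here for illustration; the evaluation code used for the reported numbers may differ.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB for 1-D signals."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Optimal scaling of the reference toward the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)


# Toy signals: a 440 Hz tone sampled at 16 kHz plus a small amount of noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
reference = np.sin(2.0 * np.pi * 440.0 * t)
estimate = reference + 0.1 * rng.standard_normal(t.shape)

score = si_sdr(estimate, reference)
score_rescaled = si_sdr(2.0 * estimate, reference)
```

Rescaling the estimate leaves the score essentially unchanged, which is the scale invariance the metric is named for.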
