Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments

This paper describes noisy speech recognition for an augmented reality headset that supports verbal communication in real multiparty conversational environments. A major approach that has actively been studied in simulated environments is to sequentially perform speech enhancement and automatic speech recognition (ASR) with deep neural networks (DNNs) trained in a supervised manner. In our task, however, such a pretrained system fails to work due to the mismatch between the training and test conditions and to the head movements of the user. To enhance only the utterances of a target speaker, we use beamforming based on a DNN speech mask estimator that can adaptively extract the speech components arriving from a particular head-relative direction. We propose a semi-supervised adaptation method that jointly updates the mask estimator and the ASR model at run time, using clean speech signals with ground-truth transcriptions and noisy speech signals with highly confident estimated transcriptions. Comparative experiments using a state-of-the-art distant speech recognition system show that the proposed method significantly improves the ASR performance.
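The adaptation procedure described above can be pictured with a short sketch. The following is a minimal, illustrative PyTorch loop, not the authors' implementation: the MaskEstimator and ASRModel classes, the tensor shapes, and the 0.9 confidence threshold are assumptions, and for brevity the estimated mask is applied directly to the noisy spectrogram instead of driving an MVDR beamformer as in the paper.

```python
# Minimal sketch (assumptions throughout) of semi-supervised joint adaptation:
# the mask estimator and the ASR model are updated together on (a) clean
# utterances with ground-truth transcriptions and (b) enhanced noisy
# utterances whose estimated transcriptions are sufficiently confident.
import torch
import torch.nn as nn

class MaskEstimator(nn.Module):
    """Toy stand-in for the DNN-based time-frequency speech mask estimator."""
    def __init__(self, n_freq=257):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_freq, 256), nn.ReLU(),
                                 nn.Linear(256, n_freq), nn.Sigmoid())
    def forward(self, spec):              # spec: (batch, frames, freq)
        return self.net(spec)             # speech mask in [0, 1]

class ASRModel(nn.Module):
    """Toy stand-in for the ASR model (frame-wise token posteriors)."""
    def __init__(self, n_freq=257, n_tokens=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_freq, 256), nn.ReLU(),
                                 nn.Linear(256, n_tokens))
    def forward(self, feats):
        return self.net(feats)            # (batch, frames, tokens) logits

def adaptation_step(mask_est, asr, optimizer, clean_batch, noisy_batch,
                    conf_threshold=0.9):
    """One joint update: supervised loss on clean data plus a pseudo-label
    loss on noisy data whose decoding confidence exceeds the threshold."""
    ce = nn.CrossEntropyLoss()

    # (a) Clean speech with ground-truth frame labels.
    clean_spec, clean_labels = clean_batch            # (B,T,F), (B,T) long
    clean_logits = asr(clean_spec)
    loss = ce(clean_logits.flatten(0, 1), clean_labels.flatten())

    # (b) Noisy speech: enhance with the estimated mask, decode greedily,
    # and keep only utterances with high average posterior confidence.
    noisy_spec = noisy_batch                          # (B,T,F)
    enhanced = mask_est(noisy_spec) * noisy_spec      # mask-based enhancement
    logits = asr(enhanced)
    probs = logits.softmax(dim=-1)
    conf, pseudo_labels = probs.max(dim=-1)           # greedy pseudo-labels
    keep = conf.mean(dim=1) > conf_threshold          # per-utterance filter
    if keep.any():
        loss = loss + ce(logits[keep].flatten(0, 1),
                         pseudo_labels[keep].flatten())

    optimizer.zero_grad()
    loss.backward()                                   # grads reach both models
    optimizer.step()
    return loss.item()

# Example wiring (hyperparameters are illustrative):
#   mask_est, asr = MaskEstimator(), ASRModel()
#   optimizer = torch.optim.Adam(
#       list(mask_est.parameters()) + list(asr.parameters()), lr=1e-4)
#   loss = adaptation_step(mask_est, asr, optimizer, clean_batch, noisy_batch)
```

In a real system the pseudo-labels would come from full ASR decoding with a language model rather than frame-wise greedy posteriors; the sketch only shows how the confidence filter gates which noisy utterances contribute to the joint update.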


Results from the Paper


 Ranked #1 on Speech Enhancement on EasyCom (SDR metric)

Task               | Dataset | Model                                      | Metric  | Value | Global Rank
Speech Recognition | EasyCom | DAJA (MVDR, HMA, 1000) (Overlapped Speech) | WER (%) | 62.36 | #3
Speech Enhancement | EasyCom | DAJA (MVDR, HMA, 1000) (Overlapped Speech) | SDR     | -4.76 | #1

Methods