no code implementations • 3 Sep 2019 • Morten Kolbæk, Zheng-Hua Tan, Søren Holdt Jensen, Jesper Jensen
Finally, we show that a loss function based on scale-invariant signal-to-distortion ratio (SI-SDR) achieves good general performance across a range of popular speech enhancement evaluation metrics, which suggests that SI-SDR is a good candidate as a general-purpose loss function for speech enhancement systems.
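The SI-SDR loss mentioned above can be sketched compactly. This is a minimal NumPy illustration of the metric (projecting the estimate onto the target and measuring the residual distortion), not the authors' training code; the function name and the `eps` stabilizer are illustrative choices.

```python
import numpy as np

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (illustrative sketch).

    The estimate is projected onto the target; everything orthogonal to the
    target is treated as distortion. Rescaling the estimate leaves the ratio
    (essentially) unchanged, which is what makes the measure scale-invariant.
    """
    # Optimal scaling of the target toward the estimate
    alpha = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = alpha * target          # target-aligned component
    distortion = estimate - projection   # residual (noise + artifacts)
    return 10.0 * np.log10(
        (np.sum(projection ** 2) + eps) / (np.sum(distortion ** 2) + eps)
    )
```

When used as a training loss, the negative of this quantity is minimized, so the network maximizes SI-SDR of its output against the clean reference.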
no code implementations • 2 Feb 2018 • Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen
Finally, we show that the proposed SE system performs on par with a traditional DNN-based Short-Time Spectral Amplitude (STSA) SE system in terms of estimated speech intelligibility.
Sound · Audio and Speech Processing
no code implementations • 31 Aug 2017 • Morten Kolbæk, Dong Yu, Zheng-Hua Tan, Jesper Jensen
We show that deep bi-directional LSTM RNNs trained using uPIT in noisy environments can improve the Signal-to-Distortion Ratio (SDR) as well as the Extended Short-Time Objective Intelligibility (ESTOI) measure on the speaker-independent multi-talker speech separation and denoising task, for various noise types and Signal-to-Noise Ratios (SNRs).
Sound
3 code implementations • 18 Mar 2017 • Morten Kolbæk, Dong Yu, Zheng-Hua Tan, Jesper Jensen
We evaluated uPIT on the WSJ0 and Danish two- and three-talker mixed-speech separation tasks and found that uPIT outperforms techniques based on Non-negative Matrix Factorization (NMF) and Computational Auditory Scene Analysis (CASA), and compares favorably with Deep Clustering (DPCL) and the Deep Attractor Network (DANet).
1 code implementation • 1 Jul 2016 • Dong Yu, Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen
We propose a novel deep learning model, which supports permutation invariant training (PIT), for speaker-independent multi-talker speech separation, commonly known as the cocktail-party problem.
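The core idea of PIT is that, since the ordering of the separated outputs is arbitrary, the loss is evaluated under every assignment of outputs to reference speakers and the minimum is used for training. A minimal sketch of that loss, assuming an MSE criterion and NumPy arrays (one per speaker); the function name is illustrative, not from the paper's code:

```python
import numpy as np
from itertools import permutations

def pit_mse(estimates, targets):
    """Permutation-invariant MSE (illustrative sketch).

    Tries every assignment of estimated sources to reference sources and
    returns the minimum mean-squared error together with the best permutation,
    so the network may emit speakers in any order without being penalized.
    """
    n = len(estimates)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average MSE over speakers under this output-to-target assignment
        loss = np.mean([np.mean((estimates[i] - targets[p]) ** 2)
                        for i, p in enumerate(perm)])
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

Because the number of permutations grows factorially with the number of speakers, this exhaustive search is practical only for the small speaker counts (two or three) considered in these separation tasks.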