Search Results for author: Christoph Boeddeker

Found 16 papers, 7 papers with code

MMS-MSG: A Multi-purpose Multi-Speaker Mixture Signal Generator

1 code implementation · 23 Sep 2022 · Tobias Cord-Landwehr, Thilo von Neumann, Christoph Boeddeker, Reinhold Haeb-Umbach

Training and evaluation of these single tasks requires synthetic data with access to intermediate signals that is as close as possible to the evaluation scenario.

Speech Enhancement

Utterance-by-utterance overlap-aware neural diarization with Graph-PIT

1 code implementation · 28 Jul 2022 · Keisuke Kinoshita, Thilo von Neumann, Marc Delcroix, Christoph Boeddeker, Reinhold Haeb-Umbach

In this paper, we argue that such an approach involving the segmentation has several issues; for example, it inevitably faces a dilemma that larger segment sizes increase both the context available for enhancing the performance and the number of speakers for the local EEND module to handle.

Speaker Diarization

A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network

no code implementations · 2 May 2022 · Tobias Gburrek, Christoph Boeddeker, Thilo von Neumann, Tobias Cord-Landwehr, Joerg Schmalenstroeer, Reinhold Haeb-Umbach

We propose a system that transcribes the conversation of a typical meeting scenario that is captured by a set of initially unsynchronized microphone arrays at unknown positions.

Automatic Speech Recognition, Speech Enhancement +1

Monaural source separation: From anechoic to reverberant environments

no code implementations · 15 Nov 2021 · Tobias Cord-Landwehr, Christoph Boeddeker, Thilo von Neumann, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach

Impressive progress in neural network-based single-channel speech source separation has been made in recent years.

SA-SDR: A novel loss function for separation of meeting style data

no code implementations · 29 Oct 2021 · Thilo von Neumann, Keisuke Kinoshita, Christoph Boeddeker, Marc Delcroix, Reinhold Haeb-Umbach

Many state-of-the-art neural network-based source separation systems use the averaged Signal-to-Distortion Ratio (SDR) as a training objective function.

Speeding Up Permutation Invariant Training for Source Separation

1 code implementation · 30 Jul 2021 · Thilo von Neumann, Christoph Boeddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach

The Hungarian algorithm can be used for uPIT and we introduce various algorithms for the Graph-PIT assignment problem to reduce the complexity to be polynomial in the number of utterances.
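As a rough illustration of the first half of that sentence, the sketch below shows how the permutation search in uPIT can be posed as a linear assignment problem. It assumes the loss decomposes into a sum of pairwise terms and uses mean squared error as a stand-in. All function names are hypothetical and are not taken from the paper's code. The Hungarian solver is a textbook O(K^3) implementation, compared against the factorial-time brute-force search. The Graph-PIT assignment algorithms are not covered here.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length signals (assumed pairwise loss)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pairwise_loss_matrix(targets, estimates):
    """cost[i][j] = loss between target i and estimate j."""
    return [[mse(t, e) for e in estimates] for t in targets]


def hungarian(cost):
    """Return assignment[i] = column matched to row i with minimal total cost.

    Textbook potentials-based Hungarian algorithm for a square matrix,
    using 1-based indices internally.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (n + 1)   # column potentials
    p = [0] * (n + 1)     # p[j] = row currently matched to column j
    way = [0] * (n + 1)   # predecessor columns on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def brute_force(cost):
    """Reference solution: try all K! permutations."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(best)


if __name__ == "__main__":
    targets = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
    estimates = [[0.9, 1.1, 0.0, 0.1], [1.0, 0.1, 0.9, 0.0], [0.1, 0.9, 0.0, 1.0]]
    cost = pairwise_loss_matrix(targets, estimates)
    print(hungarian(cost), brute_force(cost))
```

Both functions should agree on the optimal permutation. The difference is only in cost: the brute-force search grows factorially with the number of speakers, while the assignment solver grows polynomially.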

Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers

1 code implementation · 30 Jul 2021 · Thilo von Neumann, Keisuke Kinoshita, Christoph Boeddeker, Marc Delcroix, Reinhold Haeb-Umbach

When processing meeting-like data in a segment-wise manner, i.e., by separating overlapping segments independently and stitching adjacent segments to continuous output streams, this constraint has to be fulfilled for any segment.

Speech Separation

Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR

no code implementations · 4 Jun 2020 · Thilo von Neumann, Christoph Boeddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

Most approaches to multi-talker overlapped speech separation and recognition assume that the number of simultaneously active speakers is given, but in realistic situations, it is typically unknown.

Automatic Speech Recognition, Speech Extraction +1

End-to-end training of time domain audio separation and recognition

no code implementations · 18 Dec 2019 · Thilo von Neumann, Keisuke Kinoshita, Lukas Drude, Christoph Boeddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

The rising interest in single-channel multi-speaker speech separation sparked development of End-to-End (E2E) approaches to multi-speaker speech recognition.

Speaker Recognition, Speech Recognition +2

Demystifying TasNet: A Dissecting Approach

no code implementations · 20 Nov 2019 · Jens Heitkaemper, Darius Jakobeit, Christoph Boeddeker, Lukas Drude, Reinhold Haeb-Umbach

In recent years time domain speech separation has excelled over frequency domain separation in single channel scenarios and noise-free environments.

Speech Separation

SMS-WSJ: Database, performance measures, and baseline recipe for multi-channel source separation and recognition

3 code implementations · 30 Oct 2019 · Lukas Drude, Jens Heitkaemper, Christoph Boeddeker, Reinhold Haeb-Umbach

We present a multi-channel database of overlapping speech for training, evaluation, and detailed analysis of source separation and extraction algorithms: SMS-WSJ -- Spatialized Multi-Speaker Wall Street Journal.

Jointly optimal dereverberation and beamforming

no code implementations · 30 Oct 2019 · Christoph Boeddeker, Tomohiro Nakatani, Keisuke Kinoshita, Reinhold Haeb-Umbach

We previously proposed an optimal (in the maximum likelihood sense) convolutional beamformer that can perform simultaneous denoising and dereverberation, and showed its superiority over the widely used cascade of a WPE dereverberation filter and a conventional MPDR beamformer.


An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription

1 code implementation · 26 Sep 2019 · Catalin Zorila, Christoph Boeddeker, Rama Doddipatla, Reinhold Haeb-Umbach

Despite the strong modeling power of neural network acoustic models, speech enhancement has been shown to deliver additional word error rate improvements if multi-channel data is available.

Speech Enhancement
