Search Results for author: Lukas Drude

Found 12 papers, 2 papers with code

Multi-View Frequency-Attention Alternative to CNN Frontends for Automatic Speech Recognition

no code implementations • 12 Jun 2023 • Belen Alastruey, Lukas Drude, Jahn Heymann, Simon Wiesler

Convolutional frontends are a typical choice for Transformer-based automatic speech recognition to preprocess the spectrogram, reduce its sequence length, and combine local information along both the time and frequency axes.

Automatic Speech Recognition • speech-recognition +1
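
The following is a minimal PyTorch sketch of the kind of convolutional frontend the abstract above refers to: two strided 2-D convolutions that subsample a log-mel spectrogram before a Transformer encoder. It is an illustration only, not the authors' model; the class name, channel counts, and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class ConvFrontend(nn.Module):
    """Typical 2-D convolutional subsampling frontend (illustrative only)."""
    def __init__(self, d_model: int = 256, n_mels: int = 80):
        super().__init__()
        # Two stride-2 convolutions reduce the time axis by a factor of 4
        # and mix local time-frequency context.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        freq_out = n_mels // 4  # the frequency axis is subsampled as well
        self.proj = nn.Linear(32 * freq_out, d_model)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, time, n_mels) log-mel spectrogram
        x = self.conv(spec.unsqueeze(1))           # (batch, 32, time/4, n_mels/4)
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)
        return self.proj(x)                        # (batch, time/4, d_model)

frontend = ConvFrontend()
print(frontend(torch.randn(2, 100, 80)).shape)    # torch.Size([2, 25, 256])
```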

Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget

no code implementations • 15 Jun 2021 • Lukas Drude, Jahn Heymann, Andreas Schwarz, Jean-Marc Valin

Automatic speech recognition (ASR) in the cloud allows the use of larger models and more powerful multi-channel signal processing front-ends compared to on-device processing.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +1

Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR

no code implementations • 4 Jun 2020 • Thilo von Neumann, Christoph Boeddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

Most approaches to multi-talker overlapped speech separation and recognition assume that the number of simultaneously active speakers is given, but in realistic situations, it is typically unknown.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +2

End-to-end training of time domain audio separation and recognition

no code implementations • 18 Dec 2019 • Thilo von Neumann, Keisuke Kinoshita, Lukas Drude, Christoph Boeddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

The rising interest in single-channel multi-speaker speech separation has sparked the development of End-to-End (E2E) approaches to multi-speaker speech recognition.

Speaker Recognition • speech-recognition +2

Demystifying TasNet: A Dissecting Approach

no code implementations • 20 Nov 2019 • Jens Heitkaemper, Darius Jakobeit, Christoph Boeddeker, Lukas Drude, Reinhold Haeb-Umbach

In recent years, time-domain speech separation has excelled over frequency-domain separation in single-channel scenarios and noise-free environments.

Speech Separation
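
As background for the dissection above, here is a compact sketch of the generic TasNet structure: a learned 1-D convolutional encoder replacing the STFT, a separator that predicts one mask per speaker, and a transposed-convolution decoder back to the waveform. The layer sizes and the trivial 1x1-conv separator are placeholders, not the configurations examined in the paper.

```python
import torch
import torch.nn as nn

class TinyTasNet(nn.Module):
    """Encoder -> per-speaker masks -> decoder (illustrative skeleton only)."""
    def __init__(self, n_src: int = 2, n_filters: int = 256, win: int = 16):
        super().__init__()
        self.n_src = n_src
        # Learned analysis filterbank in place of the STFT.
        self.encoder = nn.Conv1d(1, n_filters, kernel_size=win, stride=win // 2)
        # Placeholder separator; real TasNet variants use LSTM or TCN blocks.
        self.separator = nn.Sequential(
            nn.Conv1d(n_filters, n_filters, kernel_size=1),
            nn.ReLU(),
            nn.Conv1d(n_filters, n_src * n_filters, kernel_size=1),
        )
        # Learned synthesis filterbank.
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel_size=win, stride=win // 2)

    def forward(self, mix: torch.Tensor) -> torch.Tensor:
        # mix: (batch, samples) time-domain mixture
        feats = torch.relu(self.encoder(mix.unsqueeze(1)))   # (B, F, frames)
        masks = torch.sigmoid(self.separator(feats))         # (B, n_src*F, frames)
        masks = masks.view(mix.size(0), self.n_src, -1, feats.size(-1))
        est = [self.decoder(masks[:, s] * feats).squeeze(1) for s in range(self.n_src)]
        return torch.stack(est, dim=1)                       # (B, n_src, samples)

model = TinyTasNet()
print(model(torch.randn(2, 16000)).shape)                    # torch.Size([2, 2, 16000])
```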

SMS-WSJ: Database, performance measures, and baseline recipe for multi-channel source separation and recognition

3 code implementations • 30 Oct 2019 • Lukas Drude, Jens Heitkaemper, Christoph Boeddeker, Reinhold Haeb-Umbach

We present a multi-channel database of overlapping speech for training, evaluation, and detailed analysis of source separation and extraction algorithms: SMS-WSJ -- Spatialized Multi-Speaker Wall Street Journal.

Position
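
To illustrate what "spatialized" multi-speaker data involves, below is a generic numpy sketch of simulating a multi-channel overlapped-speech mixture: each clean utterance is convolved with a per-microphone room impulse response and the reverberated images are summed with noise. This is not the SMS-WSJ recipe or its API; the shapes and noise level are placeholder assumptions.

```python
import numpy as np

def spatialize_mixture(utterances, rirs, noise_std=0.01):
    """Simulate a multi-channel overlapped-speech mixture (illustrative only).

    utterances: list of (samples,) clean single-channel signals
    rirs:       (n_speakers, n_mics, rir_len) room impulse responses
    returns:    (n_mics, samples + rir_len - 1) noisy reverberant mixture
    """
    n_speakers, n_mics, rir_len = rirs.shape
    n_out = max(len(u) for u in utterances) + rir_len - 1
    mix = np.zeros((n_mics, n_out))
    for spk, utt in enumerate(utterances):
        for mic in range(n_mics):
            img = np.convolve(utt, rirs[spk, mic])   # reverberated "image" signal
            mix[mic, :len(img)] += img
    return mix + noise_std * np.random.randn(*mix.shape)

# Two 1-second utterances at 8 kHz, 6 microphones, 0.5 s RIRs
utts = [np.random.randn(8000), np.random.randn(8000)]
rirs = np.random.randn(2, 6, 4000) * 0.01
print(spatialize_mixture(utts, rirs).shape)          # (6, 11999)
```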

Unsupervised training of a deep clustering model for multichannel blind source separation

no code implementations • 2 Apr 2019 • Lukas Drude, Daniel Hasenklever, Reinhold Haeb-Umbach

We propose a training scheme to train neural network-based source separation algorithms from scratch when parallel clean data is unavailable.

blind source separation • Clustering +2
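
For context, deep clustering models are conventionally trained with the affinity loss ||V V^T - Y Y^T||_F^2 between unit-norm time-frequency embeddings V and one-hot source labels Y; the paper above removes the need for such parallel clean supervision. The sketch below shows only the standard supervised loss, using the usual low-rank expansion, and is not the paper's unsupervised training scheme.

```python
import numpy as np

def deep_clustering_loss(embeddings, labels):
    """Standard deep clustering affinity loss ||V V^T - Y Y^T||_F^2.

    embeddings: (n_tf_bins, emb_dim) unit-norm embeddings V (one per T-F bin)
    labels:     (n_tf_bins, n_sources) one-hot source assignments Y
    """
    V, Y = embeddings, labels
    # Expanding the Frobenius norm avoids forming the n_tf x n_tf affinity matrices.
    return (np.linalg.norm(V.T @ V, "fro") ** 2
            - 2 * np.linalg.norm(V.T @ Y, "fro") ** 2
            + np.linalg.norm(Y.T @ Y, "fro") ** 2)

# Toy example: 1000 T-F bins, 20-dim embeddings, 2 sources
rng = np.random.default_rng(0)
V = rng.standard_normal((1000, 20))
V /= np.linalg.norm(V, axis=1, keepdims=True)
Y = np.eye(2)[rng.integers(0, 2, size=1000)]
print(deep_clustering_loss(V, Y))
```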

Unsupervised training of neural mask-based beamforming

no code implementations • 2 Apr 2019 • Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach

In contrast to previous work on unsupervised training of neural mask estimators, our approach avoids the need for a possibly pre-trained teacher model entirely.

speech-recognition • Speech Recognition
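
As background on mask-based beamforming, here is a minimal numpy sketch of the common pipeline: time-frequency masks weight the observations to form speech and noise spatial covariance matrices, from which an MVDR beamformer (in the Souden formulation) is computed per frequency bin. This is a generic illustration, not the paper's unsupervised training procedure; the fixed reference channel and diagonal loading are simplifications.

```python
import numpy as np

def mask_based_mvdr(stft, speech_mask, noise_mask, ref_mic=0, eps=1e-8):
    """Apply an MVDR beamformer driven by T-F masks (illustrative only).

    stft:        (n_mics, n_freq, n_frames) complex multi-channel STFT
    speech_mask: (n_freq, n_frames) values in [0, 1]
    noise_mask:  (n_freq, n_frames) values in [0, 1]
    returns:     (n_freq, n_frames) beamformed single-channel STFT
    """
    n_mics, n_freq, n_frames = stft.shape
    out = np.zeros((n_freq, n_frames), dtype=complex)
    for f in range(n_freq):
        X = stft[:, f, :]                                    # (n_mics, n_frames)
        # Mask-weighted spatial covariance matrices.
        phi_s = (speech_mask[f] * X) @ X.conj().T / (speech_mask[f].sum() + eps)
        phi_n = (noise_mask[f] * X) @ X.conj().T / (noise_mask[f].sum() + eps)
        phi_n += eps * np.eye(n_mics)                        # diagonal loading
        # MVDR: w = Phi_n^{-1} Phi_s e_ref / tr(Phi_n^{-1} Phi_s)
        numerator = np.linalg.solve(phi_n, phi_s)
        w = numerator[:, ref_mic] / (np.trace(numerator) + eps)
        out[f] = w.conj() @ X
    return out

# Toy usage with random data: 6 mics, 257 frequency bins, 100 frames
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 257, 100)) + 1j * rng.standard_normal((6, 257, 100))
m = rng.uniform(size=(257, 100))
print(mask_based_mvdr(X, m, 1 - m).shape)                    # (257, 100)
```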

Directional Statistics and Filtering Using libDirectional

no code implementations • 28 Dec 2017 • Gerhard Kurz, Igor Gilitschenski, Florian Pfaff, Lukas Drude, Uwe D. Hanebeck, Reinhold Haeb-Umbach, Roland Y. Siegwart

In this paper, we present libDirectional, a MATLAB library for directional statistics and directional estimation.
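
libDirectional itself is a MATLAB toolbox; the Python snippet below does not use or mirror its API. It only illustrates two basic quantities from directional statistics: the circular mean direction and the mean resultant length of a sample of angles.

```python
import numpy as np

def circular_mean_and_resultant(angles):
    """Circular mean direction and mean resultant length of angles (radians).

    Angles are mapped to unit vectors on the circle; the mean direction is the
    angle of their average, and the resultant length R in [0, 1] measures
    concentration (R close to 1 means tightly clustered directions).
    """
    z = np.mean(np.exp(1j * np.asarray(angles)))
    return np.angle(z), np.abs(z)

# Angles near +/- pi wrap around correctly, unlike an ordinary arithmetic mean.
samples = np.array([-3.1, 3.1, 3.0, -3.0])       # all close to pi
mean_dir, R = circular_mean_and_resultant(samples)
print(mean_dir, R)                               # mean direction ~ pi, R close to 1
```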
