no code implementations • 12 Jun 2023 • Belen Alastruey, Lukas Drude, Jahn Heymann, Simon Wiesler
Convolutional frontends are a typical choice for Transformer-based automatic speech recognition to preprocess the spectrogram, reduce its sequence length, and combine local information in time and frequency similarly.
no code implementations • 15 Jun 2021 • Lukas Drude, Jahn Heymann, Andreas Schwarz, Jean-Marc Valin
Automatic speech recognition (ASR) in the cloud allows the use of larger models and more powerful multi-channel signal processing front-ends compared to on-device processing.
no code implementations • 2 Apr 2019 • Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach
In contrast to previous work on unsupervised training of neural mask estimators, our approach avoids the need for a possibly pre-trained teacher model entirely.
no code implementations • 30 Mar 2018 • Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai
This paper introduces a new open source platform for end-to-end speech processing named ESPnet.
1 code implementation • ICASSP 2016 • Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach
The network training is independent of the number and the geometric configuration of the microphones.