no code implementations • 12 Jun 2023 • Belen Alastruey, Lukas Drude, Jahn Heymann, Simon Wiesler
Convolutional frontends are a typical choice for Transformer-based automatic speech recognition: they preprocess the spectrogram, reduce its sequence length, and combine local information in both time and frequency.
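To illustrate the role such a frontend plays, here is a minimal 1-D sketch (not the paper's architecture): a strided "convolution" over the time axis that combines neighboring frames and shortens the sequence, applied twice for roughly 4x subsampling. The kernel, shapes, and function name are all illustrative assumptions.

```python
import numpy as np

def conv_subsample(spec, kernel, stride=2):
    # spec: (T, F) log-mel spectrogram; kernel: (k,) filter over time.
    # Each output frame mixes k neighboring input frames (local context),
    # and the stride halves the sequence length.
    T, F = spec.shape
    k = len(kernel)
    out_len = (T - k) // stride + 1
    out = np.zeros((out_len, F))
    for t in range(out_len):
        window = spec[t * stride : t * stride + k]  # local time context
        out[t] = kernel @ window                    # combine it
    return out

rng = np.random.default_rng(0)
spec = rng.standard_normal((100, 80))          # 100 frames, 80 mel bins
kernel = np.ones(3) / 3                        # toy averaging "filter"
h = conv_subsample(conv_subsample(spec, kernel), kernel)  # two stride-2 layers
print(h.shape)                                 # time axis ~4x shorter
```

A real frontend would use learned 2-D convolutions that also mix frequency bins; this sketch only shows the sequence-length reduction and local mixing in time.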
no code implementations • 27 Oct 2022 • Alejandro Gomez-Alanis, Lukas Drude, Andreas Schwarz, Rupak Vignesh Swaminathan, Simon Wiesler
We also propose a dual-mode contextual-utterance training technique for streaming automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR) +1
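"Dual-mode" training typically means running the same weights in both a streaming (causal) and a full-context mode and combining the two losses. The sketch below, which is an assumption about the general idea rather than the paper's exact method, shows a tiny self-attention computed with and without a causal mask over shared inputs.

```python
import numpy as np

def attention(Q, K, V, causal=False):
    # Scaled dot-product self-attention over a (T, d) sequence.
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    if causal:  # streaming mode: each frame may only attend to the past
        scores = np.where(np.tril(np.ones((T, T), bool)), scores, -1e9)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
full = attention(X, X, X, causal=False)    # full-context mode
stream = attention(X, X, X, causal=True)   # streaming mode, same weights
# Dual-mode training would combine losses from both forward passes.
```

With the causal mask, the first frame can only attend to itself, so its streaming output equals its input value here.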
no code implementations • 20 Nov 2020 • Andreas Schwarz, Ilya Sklyar, Simon Wiesler
We present a training scheme for streaming automatic speech recognition (ASR) based on recurrent neural network transducers (RNN-T) which allows the encoder network to learn to exploit context audio from a stream, using segmented or partially labeled sequences of the stream during training.
Automatic Speech Recognition (ASR) +1
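The core data-handling idea (pairing each labeled segment with unlabeled context audio drawn from the same stream) can be sketched as follows. The segment and context lengths, and the helper name, are illustrative assumptions, not the paper's specification.

```python
def segments_with_context(stream, seg_len, ctx_len):
    """Split a stream into labeled segments, each paired with the
    preceding context audio from the same stream (which carries no labels)."""
    out = []
    for start in range(0, len(stream), seg_len):
        ctx = stream[max(0, start - ctx_len):start]  # context audio, unlabeled
        seg = stream[start:start + seg_len]          # labeled segment
        out.append((ctx, seg))
    return out

stream = list(range(10))                  # stand-in for audio frames
pairs = segments_with_context(stream, seg_len=4, ctx_len=2)
```

During training, the encoder would consume context plus segment while the loss is computed only on the labeled segment, letting the encoder learn to exploit stream context.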
no code implementations • 18 Nov 2016 • Yotaro Kubo, George Tucker, Simon Wiesler
We introduce dropout compaction, a novel method for training feed-forward neural networks which realizes the performance gains of training a large model with dropout regularization, yet extracts a compact neural network for run-time efficiency.
no code implementations • NeurIPS 2011 • Simon Wiesler, Hermann Ney
Log-linear models are widely used probability models for statistical pattern recognition.
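A log-linear model assigns class posteriors of the form p(c | x) ∝ exp(λ · f(x, c)), normalized over all classes. A minimal sketch (feature values and weights are made up for illustration):

```python
import math

def log_linear_posterior(weights, features):
    # features[c] is the feature vector f(x, c); weights are the lambdas.
    # p(c | x) = exp(lambda . f(x, c)) / sum_c' exp(lambda . f(x, c'))
    scores = {c: sum(w * f for w, f in zip(weights, fv))
              for c, fv in features.items()}
    m = max(scores.values())                       # stabilize the exponentials
    exps = {c: math.exp(s - m) for c, s in scores.items()}
    Z = sum(exps.values())                         # partition function
    return {c: e / Z for c, e in exps.items()}

p = log_linear_posterior([1.0, -0.5], {"a": [2.0, 1.0], "b": [0.5, 0.0]})
```

The normalizer Z is what makes training such models a (convex) optimization over the weights λ, one reason they are popular in statistical pattern recognition.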