no code implementations • 18 Apr 2023 • Maurits Bleeker, Pawel Swietojanski, Stefan Braun, Xiaodan Zhuang
By including approximate nearest neighbour phrases (ANN-P) in the context list, we encourage the learned representation to disambiguate between similar, but not identical, biasing phrases.
no code implementations • 29 Nov 2022 • Stefan Braun, Erik McDermott, Roger Hsiao
As a highlight, we manage to compute the transducer loss and gradients for a batch size of 1024 and an audio length of 40 seconds using only 6 GB of memory.
Automatic Speech Recognition (ASR)
no code implementations • 2 Nov 2022 • Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
This work studies the use of attention masking in transformer-transducer-based speech recognition to build a single configurable model for different deployment scenarios.
no code implementations • 21 Oct 2022 • Thien Nguyen, Nathalie Tran, Liuhui Deng, Thiago Fraga da Silva, Matthew Radzihovsky, Roger Hsiao, Henry Mason, Stefan Braun, Erik McDermott, Dogan Can, Pawel Swietojanski, Lyan Verwimp, Sibel Oyman, Tresi Arvizo, Honza Silovsky, Arnab Ghoshal, Mathieu Martel, Bharat Ram Ambati, Mohamed Ali
Code-switching describes the practice of using more than one language in the same sentence.
Automatic Speech Recognition (ASR)
no code implementations • 20 Sep 2022 • Joscha Grüger, Tobias Geyer, Martin Kuhn, Stefan Braun, Ralph Bergmann
Thus, this technique is well suited to the medical context, where it can be used to compare treatment cases with clinical guidelines.
no code implementations • 2 Nov 2020 • Ting-yao Hu, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Stefan Braun, Kyuyeon Hwang, Ozlem Kalinli, Oncel Tuzel
Our policy adapts the augmentation parameters based on the training loss of the data samples.
Automatic Speech Recognition (ASR)
1 code implementation • 5 Jun 2018 • Stefan Braun
This study benchmarks different implementations of LSTM units across the deep learning frameworks PyTorch, TensorFlow, Lasagne and Keras.
Automatic Speech Recognition (ASR)
no code implementations • ICLR 2018 • Stefan Braun, Daniel Neil, Enea Ceolini, Jithendar Anumula, Shih-Chii Liu
Recent work on encoder-decoder models for sequence-to-sequence mapping has shown that integrating both temporal and spatial attention mechanisms into neural networks substantially improves system performance.
no code implementations • 22 Jun 2016 • Stefan Braun, Daniel Neil, Shih-Chii Liu
The performance of automatic speech recognition systems in noisy environments still leaves room for improvement.
Automatic Speech Recognition (ASR)