no code implementations • IWSLT 2016 • Thai-Son Nguyen, Markus Müller, Matthias Sperber, Thomas Zenkel, Kevin Kilgour, Sebastian Stüker, Alex Waibel
For the English TED task, our best combination system has a WER of 7.8% on the development set while our other combinations gained 21.8% and 28.7% WERs for the English and German MSLT tasks.
no code implementations • IWSLT 2016 • Markus Müller, Sebastian Stüker, Alex Waibel
For system training, we use additional data from French, German and Turkish.
no code implementations • IWSLT 2017 • Thai-Son Nguyen, Markus Müller, Matthias Sperber, Thomas Zenkel, Sebastian Stüker, Alex Waibel
For the English lecture task, our best combination system has a WER of 8.3% on the tst2015 development set while our other combinations gained 25.7% WER for German lecture tasks.
no code implementations • IWSLT (EMNLP) 2018 • Matthias Sperber, Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Thanh-Le Ha, Sebastian Stüker, Alex Waibel
The baseline system is a cascade of an ASR system, a system to segment the ASR output and a neural machine translation system.
no code implementations • 23 Feb 2022 • Gencer Sumbul, Markus Müller, Begüm Demir
Due to the availability of multi-modal remote sensing (RS) image archives, one of the most important research topics is the development of cross-modal RS image retrieval (CM-RSIR) methods that search semantically similar images across different modalities.
no code implementations • 11 Mar 2021 • Valentina Ros, Markus Müller
The first occurs between a non-resonating insulator and an intermittent metal.
Disordered Systems and Neural Networks • Statistical Mechanics
no code implementations • 18 Nov 2020 • Bhuvan Agrawal, Markus Müller, Martin Radfar, Samridhi Choudhary, Athanasios Mouchtaris, Siegfried Kunzmann
In this paper, we treat an E2E system as a multi-modal model, with audio and text functioning as its two modalities, and use a cross-modal latent space (CMLS) architecture, where a shared latent space is learned between the 'acoustic' and 'text' embeddings.
no code implementations • 8 Jul 2020 • Surabhi Punjabi, Harish Arsikere, Zeynab Raeesy, Chander Chandak, Nikhil Bhave, Ankish Bansal, Markus Müller, Sergio Murillo, Ariya Rastrow, Sri Garimella, Roland Maas, Mat Hans, Athanasios Mouchtaris, Siegfried Kunzmann
Experiments show that for English-Spanish, the bilingual joint ASR-LID architecture matches monolingual ASR and acoustic-only LID accuracies.
no code implementations • 30 Apr 2019 • Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel
Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community.
no code implementations • 5 Jul 2018 • Markus Müller, Sebastian Stüker, Alex Waibel
Multilingual Speech Recognition is one of the most costly AI problems, because each language (7,000+) and even different accents require their own acoustic models to obtain best recognition performance.
no code implementations • 13 Nov 2017 • Markus Müller, Sebastian Stüker, Alex Waibel
In this work, we focus on multilingual systems based on recurrent neural networks (RNNs), trained using the Connectionist Temporal Classification (CTC) loss function.
no code implementations • 13 Nov 2017 • Markus Müller, Sebastian Stüker, Alex Waibel
We evaluated the use of different language combinations as well as the addition of Language Feature Vectors (LFVs).
Automatic Speech Recognition (ASR)
1 code implementation • 2 Jun 2017 • Robin Ruede, Markus Müller, Sebastian Stüker, Alex Waibel
Backchannels (BCs) can be expressed in different ways, depending on the modality of the interaction, for example as gestures or acoustic cues.