no code implementations • IWSLT 2017 • Ngoc-Quan Pham, Matthias Sperber, Elizabeth Salesky, Thanh-Le Ha, Jan Niehues, Alexander Waibel
For the SLT track, in addition to a monolingual neural translation system used to generate correct punctuation and true casing for the data prior to training our multilingual system, we introduced a noise model to make our system more robust.
no code implementations • IWSLT 2016 • Eunah Cho, Jan Niehues, Thanh-Le Ha, Matthias Sperber, Mohammed Mediani, Alex Waibel
In addition, we investigated methods to combine NMT systems that encode the input as well as the output differently.
no code implementations • IWSLT 2016 • Eunah Cho, Jan Niehues, Thanh-Le Ha, Alex Waibel
In this paper, we investigate a multilingual approach for speech disfluency removal.
no code implementations • IWSLT (EMNLP) 2018 • Matthias Sperber, Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Thanh-Le Ha, Sebastian Stüker, Alex Waibel
The baseline system is a cascade of an ASR system, a system to segment the ASR output and a neural machine translation system.
no code implementations • EMNLP (IWSLT) 2019 • Ngoc-Quan Pham, Thai-Son Nguyen, Thanh-Le Ha, Juan Hussain, Felix Schneider, Jan Niehues, Sebastian Stüker, Alexander Waibel
This paper describes KIT’s submission to the IWSLT 2019 Speech Translation task on two sub-tasks corresponding to two different datasets.
no code implementations • EMNLP (IWSLT) 2019 • Jan Niehues, Rolando Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico
The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.
no code implementations • ACL (IWSLT) 2021 • Ngoc-Quan Pham, Tuan Nam Nguyen, Thanh-Le Ha, Sebastian Stüker, Alexander Waibel, Dan He
This paper describes the submission of the Karlsruhe Institute of Technology (KIT) to the multilingual TEDx translation task in the IWSLT 2021 evaluation campaign.
no code implementations • ACL (IWSLT) 2021 • Tuan Nam Nguyen, Thai Son Nguyen, Christian Huber, Ngoc-Quan Pham, Thanh-Le Ha, Felix Schneider, Sebastian Stüker
We describe a system for both the cascaded condition and the end-to-end condition.
no code implementations • EAMT 2020 • Maciej Modrzejewski, Miriam Exel, Bianka Buschbeck, Thanh-Le Ha, Alexander Waibel
The correct translation of named entities (NEs) still poses a challenge for conventional neural machine translation (NMT) systems.
no code implementations • LoResMT (AACL) 2020 • Thi-Vinh Ngo, Phuong-Thai Nguyen, Thanh-Le Ha, Khac-Quy Dinh, Le-Minh Nguyen
Prior works have demonstrated that a low-resource language pair can benefit from multilingual machine translation (MT) systems, which rely on the joint training of many language pairs.
no code implementations • 16 Dec 2020 • Thi-Vinh Ngo, Phuong-Thai Nguyen, Thanh-Le Ha, Khac-Quy Dinh, Le-Minh Nguyen
Prior works have demonstrated that a low-resource language pair can benefit from multilingual machine translation (MT) systems, which rely on the joint training of many language pairs.
no code implementations • WS 2020 • Ngoc-Quan Pham, Felix Schneider, Tuan-Nam Nguyen, Thanh-Le Ha, Thai Son Nguyen, Maximilian Awiszus, Sebastian Stüker, Alexander Waibel
This paper describes KIT's submissions to the IWSLT 2020 Speech Translation evaluation campaign.
no code implementations • 20 May 2020 • Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel
We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation.
no code implementations • LREC 2020 • Dario Franceschini, Chiara Canton, Ivan Simonini, Armin Schweinfurth, Adelheid Glott, Sebastian Stüker, Thai-Son Nguyen, Felix Schneider, Thanh-Le Ha, Alex Waibel, Barry Haddow, Philip Williams, Rico Sennrich, Ondřej Bojar, Sangeet Sagar, Dominik Macháček, Otakar Smrž
This paper presents our progress towards deploying a versatile communication platform for highly multilingual live speech translation at conferences and live subtitling of remote meetings.
no code implementations • 22 Mar 2020 • Thai Son Nguyen, Jan Niehues, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Muller, Matthias Sperber, Sebastian Stueker, Alex Waibel
User studies have shown that reducing the latency of our simultaneous lecture translation system should be the most important goal.
no code implementations • WS 2019 • Thi-Vinh Ngo, Thanh-Le Ha, Phuong-Thai Nguyen, Le-Minh Nguyen
Among the six challenges of neural machine translation (NMT) identified by Koehn and Knowles (2017), the rare-word problem is considered the most severe, especially in the translation of low-resource languages.
1 code implementation • EMNLP (IWSLT) 2019 • Thi-Vinh Ngo, Thanh-Le Ha, Phuong-Thai Nguyen, Le-Minh Nguyen
While translating between East Asian languages, many works have discovered clear advantages of using characters as the translation unit.
no code implementations • WS 2019 • Ngoc-Quan Pham, Jan Niehues, Thanh-Le Ha, Alex Waibel
We investigated the behaviour of such models on the standard IWSLT 2017 multilingual dataset.
no code implementations • COLING 2018 • Florian Dessloch, Thanh-Le Ha, Markus Müller, Jan Niehues, Thai-Son Nguyen, Ngoc-Quan Pham, Elizabeth Salesky, Matthias Sperber, Sebastian Stüker, Thomas Zenkel, Alexander Waibel
Combining these techniques, we are able to provide an adapted speech translation system for several European languages.
no code implementations • 1 Aug 2018 • Jan Niehues, Ngoc-Quan Pham, Thanh-Le Ha, Matthias Sperber, Alex Waibel
After adaptation, we are able to reduce the number of corrections displayed during incremental output construction by 45%, without a decrease in translation quality.
1 code implementation • 18 May 2018 • Thi-Vinh Ngo, Thanh-Le Ha, Phuong-Thai Nguyen, Le-Minh Nguyen
Neural machine translation (NMT) systems have recently achieved state-of-the-art results for many popular language pairs because of the availability of data.
1 code implementation • IWSLT 2017 • Thanh-Le Ha, Jan Niehues, Alexander Waibel
In this paper, we propose two strategies that can be applied to a multilingual neural machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus.
no code implementations • WS 2017 • Jan Niehues, Eunah Cho, Thanh-Le Ha, Alex Waibel
By separating the search space and the modeling using n-best list reranking, we analyze the influence of both parts of an NMT system independently.
no code implementations • IWSLT 2016 • Thanh-Le Ha, Jan Niehues, Alexander Waibel
In this paper, we present our first attempts in building a multilingual Neural Machine Translation framework under a unified approach.
no code implementations • COLING 2016 • Jan Niehues, Eunah Cho, Thanh-Le Ha, Alex Waibel
We analyzed the influence of the quality of the initial system on the final result.
no code implementations • NAACL 2016 • Markus Müller, Thai Son Nguyen, Jan Niehues, Eunah Cho, Bastian Krüger, Thanh-Le Ha, Kevin Kilgour, Matthias Sperber, Mohammed Mediani, Sebastian Stüker, Alex Waibel
no code implementations • 28 Apr 2015 • Thanh-Le Ha, Jan Niehues, Alex Waibel
In this paper we combine the advantages of a model using global source sentence contexts, the Discriminative Word Lexicon, and neural networks.