no code implementations • IWSLT (EMNLP) 2018 • Matthias Sperber, Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Thanh-Le Ha, Sebastian Stüker, Alex Waibel
The baseline system is a cascade of an ASR system, a system to segment the ASR output, and a neural machine translation system.
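A minimal sketch of such a cascade as a pipeline, with hypothetical asr/segment/translate callables standing in for the actual systems (the names and interfaces are assumptions, not the paper's implementation):

```python
def cascade_translate(audio, asr, segment, translate):
    """Run the three-stage cascade: speech -> text -> segments -> translations."""
    transcript = asr(audio)             # ASR: speech to (unsegmented) text
    sentences = segment(transcript)     # re-segment ASR output into sentence units
    return [translate(s) for s in sentences]  # NMT applied per segment
```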
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • IWSLT 2016 • Eunah Cho, Jan Niehues, Thanh-Le Ha, Matthias Sperber, Mohammed Mediani, Alex Waibel
In addition, we investigated methods to combine NMT systems that encode the input as well as the output differently.
no code implementations • IWSLT 2017 • Ngoc-Quan Pham, Matthias Sperber, Elizabeth Salesky, Thanh-Le Ha, Jan Niehues, Alexander Waibel
For the SLT track, in addition to a monolingual neural translation system used to generate correct punctuation and true casing of the data prior to training our multilingual system, we introduced a noise model to make our system more robust.
no code implementations • IWSLT 2016 • Thai-Son Nguyen, Markus Müller, Matthias Sperber, Thomas Zenkel, Kevin Kilgour, Sebastian Stüker, Alex Waibel
For the English TED task, our best combination system has a WER of 7.8% on the development set, while our other combinations achieved WERs of 21.8% and 28.7% on the English and German MSLT tasks.
no code implementations • IWSLT 2016 • Micha Wetzel, Matthias Sperber, Alexander Waibel
Speech that contains multimedia content can pose a serious challenge for real-time automatic speech recognition (ASR) for two reasons: (1) The ASR produces meaningless output, hurting the readability of the transcript.
Automatic Speech Recognition (ASR) +1
no code implementations • IWSLT 2017 • Matthias Sperber, Jan Niehues, Alex Waibel
We note that unlike our baseline model, models trained on noisy data are able to generate outputs of proper length even for noisy inputs, while gradually reducing output length for higher amounts of noise, as might also be expected from a human translator.
no code implementations • IWSLT 2017 • Thai-Son Nguyen, Markus Müller, Matthias Sperber, Thomas Zenkel, Sebastian Stüker, Alex Waibel
For the English lecture task, our best combination system has a WER of 8.3% on the tst2015 development set, while our other combinations achieved a 25.7% WER on the German lecture tasks.
no code implementations • 7 Nov 2024 • Ibrahim Said Ahmad, Antonios Anastasopoulos, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, Barry Haddow, Dávid Javorský, Mateusz Krubiński, Tsz Kin Lam, Xutai Ma, Prashant Mathur, Evgeny Matusov, Chandresh Maurya, John McCrae, Kenton Murray, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, Atul Kr. Ojha, John Ortega, Sara Papi, Peter Polák, Adam Pospíšil, Pavel Pecina, Elizabeth Salesky, Nivedita Sethiya, Balaram Sarkar, Jiatong Shi, Claytone Sikasote, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Brian Thompson, Marco Turchi, Alex Waibel, Shinji Watanabe, Patrick Wilken, Petr Zemánek, Rodolfo Zevallos
This paper reports on the shared tasks organized by the 21st IWSLT Conference.
no code implementations • 31 Oct 2024 • Ioannis Tsiamas, Matthias Sperber, Andrew Finch, Sarthak Garg
The prosody of a spoken utterance, including features like stress, intonation and rhythm, can significantly affect the underlying semantics, and as a consequence can also affect its textual translation.
no code implementations • 6 Jun 2024 • Matthias Sperber, Ondřej Bojar, Barry Haddow, Dávid Javorský, Xutai Ma, Matteo Negri, Jan Niehues, Peter Polák, Elizabeth Salesky, Katsuhito Sudoh, Marco Turchi
Human evaluation is a critical component in machine translation system development and has received much attention in text translation research.
1 code implementation • 19 Oct 2023 • Belen Alastruey, Matthias Sperber, Christian Gollan, Dominic Telaar, Tim Ng, Aashish Agarwal
Code-switching (CS), i.e., mixing different languages in a single sentence, is a common phenomenon in communication and can be challenging in many Natural Language Processing (NLP) settings.
1 code implementation • 5 Sep 2023 • Javier Ferrando, Matthias Sperber, Hendra Setiawan, Dominic Telaar, Saša Hasan
Behavioral testing in NLP allows fine-grained evaluation of systems by examining their linguistic capabilities through the analysis of input-output behavior.
no code implementations • 20 Dec 2022 • Mozhdeh Gheini, Tatiana Likhomanenko, Matthias Sperber, Hendra Setiawan
Self-training has been shown to be helpful in addressing data scarcity for many domains, including vision, speech, and language.
2 code implementations • Findings (ACL) 2022 • Orion Weller, Matthias Sperber, Telmo Pires, Hendra Setiawan, Christian Gollan, Dominic Telaar, Matthias Paulik
Code switching (CS) refers to the phenomenon of interchangeably using words and phrases from different languages.
no code implementations • EACL 2021 • Orion Weller, Matthias Sperber, Christian Gollan, Joris Kluivers
However, all previous work has only looked at this problem from the consecutive perspective, leaving uncertainty on whether these approaches are effective in the more challenging streaming setting.
Automatic Speech Recognition (ASR) +2
1 code implementation • 24 Jul 2020 • Matthias Sperber, Hendra Setiawan, Christian Gollan, Udhyakumar Nallasamy, Matthias Paulik
To address various shortcomings of this paradigm, recent work explores end-to-end trainable direct models that translate without transcribing.
no code implementations • ACL 2020 • Hendra Setiawan, Matthias Sperber, Udhay Nallasamy, Matthias Paulik
Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations, conditioned not only on the source sentence but also on some latent random variables.
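As a reference, VNMT-style models are typically trained by maximizing an evidence lower bound (ELBO) on the conditional likelihood; the generic formulation below (following the usual VNMT setup) is a sketch with assumed notation, not necessarily this paper's exact objective:

```latex
% Generic VNMT ELBO: reconstruction term minus KL to a source-conditioned prior
\log p_\theta(y \mid x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\bigl[\log p_\theta(y \mid x, z)\bigr]
  \;-\; \mathrm{KL}\bigl(q_\phi(z \mid x, y)\,\big\|\,p_\theta(z \mid x)\bigr)
```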
no code implementations • ACL 2020 • Matthias Sperber, Matthias Paulik
Over its three-decade history, speech translation has experienced several shifts in its primary research themes, moving from loosely coupled cascades of speech recognition and machine translation, to exploring questions of tight coupling, and finally to end-to-end models that have recently attracted much attention.
no code implementations • 22 Mar 2020 • Thai-Son Nguyen, Jan Niehues, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Müller, Matthias Sperber, Sebastian Stüker, Alex Waibel
User studies have shown that reducing the latency of our simultaneous lecture translation system should be the most important goal.
Automatic Speech Recognition (ASR) +2
no code implementations • ACL 2019 • Elizabeth Salesky, Matthias Sperber, Alan W. Black
Previous work on end-to-end translation from speech has primarily used frame-level features as speech representations, which creates longer, sparser sequences than text.
no code implementations • ACL 2019 • Matthias Sperber, Graham Neubig, Ngoc-Quan Pham, Alex Waibel
Lattices are an efficient and effective method to encode ambiguity of upstream systems in natural language processing tasks, for example to compactly capture multiple speech recognition hypotheses, or to represent multiple linguistic analyses.
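A minimal sketch of a word lattice as a DAG with weighted arcs, using a hypothetical adjacency-list encoding and toy probabilities (real ASR lattices carry acoustic and language-model scores rather than these illustrative weights):

```python
from collections import defaultdict

lattice = defaultdict(list)   # node -> [(next_node, word, prob)]
lattice[0] += [(1, "I", 1.0)]
lattice[1] += [(2, "ate", 0.6), (2, "eight", 0.4)]   # competing hypotheses
lattice[2] += [(3, "fish", 1.0)]

def paths(node, prefix=(), prob=1.0, end=3):
    """Enumerate every hypothesis the lattice compactly encodes."""
    if node == end:
        yield prefix, prob
        return
    for nxt, word, p in lattice[node]:
        yield from paths(nxt, prefix + (word,), prob * p, end)

for words, p in paths(0):
    print(" ".join(words), p)   # "I ate fish" 0.6, "I eight fish" 0.4
```

The point of the structure is that N competing hypotheses share common prefixes and suffixes, so the DAG stays compact even when the number of distinct paths is large.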
no code implementations • NAACL 2019 • Elizabeth Salesky, Matthias Sperber, Alex Waibel
Spoken language translation applications suffer from conversational speech phenomena, particularly the presence of disfluencies.
no code implementations • TACL 2019 • Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel
Speech translation has traditionally been approached through cascaded models consisting of a speech recognizer trained on a corpus of transcribed speech, and a machine translation system trained on parallel texts.
no code implementations • ACL 2019 • Zhong Zhou, Matthias Sperber, Alex Waibel
Our multi-paraphrase NMT, trained on only two languages, outperforms the multilingual baselines.
no code implementations • 1 Aug 2018 • Jan Niehues, Ngoc-Quan Pham, Thanh-Le Ha, Matthias Sperber, Alex Waibel
After adaptation, we are able to reduce the number of corrections displayed during incremental output construction by 45%, without a decrease in translation quality.
no code implementations • COLING 2018 • Florian Dessloch, Thanh-Le Ha, Markus Müller, Jan Niehues, Thai-Son Nguyen, Ngoc-Quan Pham, Elizabeth Salesky, Matthias Sperber, Sebastian Stüker, Thomas Zenkel, Alexander Waibel
Combining these techniques, we are able to provide an adapted speech translation system for several European languages.
no code implementations • WS 2018 • Zhong Zhou, Matthias Sperber, Alex Waibel
The main challenges we identify are the lack of low-resource language data, effective methods for cross-lingual transfer, and the variable-binding problem that is common in neural systems.
1 code implementation • 26 Mar 2018 • Matthias Sperber, Jan Niehues, Graham Neubig, Sebastian Stüker, Alex Waibel
Self-attention is a method of encoding sequences of vectors by relating these vectors to each other based on pairwise similarities.
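A minimal sketch of single-head scaled dot-product self-attention in plain NumPy, to make the "pairwise similarities" concrete; this is an illustrative baseline, not the paper's exact acoustic-model variant:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V   # each vector re-encoded as a similarity-weighted mixture

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                         # 5 vectors of dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)                 # shape (5, 8)
```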
1 code implementation • WS 2018 • Graham Neubig, Matthias Sperber, Xinyi Wang, Matthieu Felix, Austin Matthews, Sarguna Padmanabhan, Ye Qi, Devendra Singh Sachan, Philip Arthur, Pierre Godard, John Hewitt, Rachid Riad, Liming Wang
In this paper we describe the design of XNMT and its experiment configuration system, and demonstrate its utility on the tasks of machine translation, speech recognition, and multi-tasked machine translation/parsing.
no code implementations • 15 Sep 2017 • Matthias Sperber, Graham Neubig, Jan Niehues, Satoshi Nakamura, Alex Waibel
We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion.
no code implementations • 15 Aug 2017 • Thomas Zenkel, Ramon Sanabria, Florian Metze, Jan Niehues, Matthias Sperber, Sebastian Stüker, Alex Waibel
The CTC loss function maps an input sequence of observable feature vectors to an output sequence of symbols.
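A minimal sketch of computing a CTC loss with PyTorch's built-in torch.nn.CTCLoss; the shapes, sequence lengths, and blank index below are illustrative assumptions, not this paper's acoustic-model setup:

```python
import torch
import torch.nn as nn

T, N, C = 50, 2, 20   # input frames, batch size, symbol inventory (incl. blank)
log_probs = torch.randn(T, N, C).log_softmax(dim=-1)   # per-frame symbol log-probs
targets = torch.randint(1, C, (N, 10))                 # index 0 reserved for blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)   # marginalizes over all monotonic alignments
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```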
no code implementations • EMNLP 2017 • Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel
In this work, we extend the TreeLSTM (Tai et al., 2015) into a LatticeLSTM that is able to consume word lattices and can be used as an encoder in an attentional encoder-decoder model.
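A rough sketch of the core idea: an LSTM step that, in child-sum TreeLSTM style, aggregates the states of several predecessors (here, multiple incoming lattice arcs) with one forget gate per predecessor. The weight names and the plain sum over predecessors are assumptions for illustration; the paper's LatticeLSTM additionally weights predecessors, e.g. by lattice scores.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lattice_lstm_step(x, preds, W, U, b):
    """x: input vector; preds: list of (h, c) states from predecessor nodes."""
    h_sum = sum(h for h, _ in preds)
    i = sigmoid(W["i"] @ x + U["i"] @ h_sum + b["i"])
    o = sigmoid(W["o"] @ x + U["o"] @ h_sum + b["o"])
    g = np.tanh(W["g"] @ x + U["g"] @ h_sum + b["g"])
    # One forget gate per predecessor, so each incoming lattice path
    # can be remembered or suppressed independently.
    c = i * g + sum(sigmoid(W["f"] @ x + U["f"] @ h) * c_prev
                    for h, c_prev in preds)
    return o * np.tanh(c), c

rng = np.random.default_rng(0)
d = 4
W = {k: rng.normal(size=(d, d)) for k in "iofg"}
U = {k: rng.normal(size=(d, d)) for k in "iofg"}
b = {k: np.zeros(d) for k in "iofg"}
preds = [(np.zeros(d), np.zeros(d)), (np.zeros(d), np.zeros(d))]  # two incoming arcs
h, c = lattice_lstm_step(rng.normal(size=d), preds, W, U, b)
```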
no code implementations • COLING 2016 • Matthias Sperber, Graham Neubig, Jan Niehues, Sebastian Stüker, Alex Waibel
Evaluating the quality of output from language processing systems such as machine translation or speech recognition is an essential step in ensuring that they are sufficient for practical use.
no code implementations • NAACL 2016 • Markus Müller, Thai Son Nguyen, Jan Niehues, Eunah Cho, Bastian Krüger, Thanh-Le Ha, Kevin Kilgour, Matthias Sperber, Mohammed Mediani, Sebastian Stüker, Alex Waibel
no code implementations • LREC 2016 • Matthias Sperber, Graham Neubig, Satoshi Nakamura, Alex Waibel
Our goal is to improve the human transcription quality via appropriate user interface design.
no code implementations • TACL 2014 • Matthias Sperber, Mirjam Simantzik, Graham Neubig, Satoshi Nakamura, Alex Waibel
In this paper, we study the problem of manually correcting automatic annotations of natural language in as efficient a manner as possible.