Search Results for author: Jan Niehues

Found 95 papers, 13 papers with code

The IWSLT 2019 Evaluation Campaign

no code implementations EMNLP (IWSLT) 2019 Jan Niehues, Roldano Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.


Maastricht University’s Large-Scale Multilingual Machine Translation System for WMT 2021

no code implementations WMT (EMNLP) 2021 Danni Liu, Jan Niehues

We present our development of the multilingual machine translation system for the large-scale multilingual machine translation task at WMT 2021.

Machine Translation Translation

Effective combination of pretrained models - KIT@IWSLT2022

no code implementations IWSLT (ACL) 2022 Ngoc-Quan Pham, Tuan Nam Nguyen, Thai-Binh Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Alexander Waibel

Pretrained models in acoustic and textual modalities can potentially improve speech translation for both Cascade and End-to-end approaches.


The IWSLT 2018 Evaluation Campaign

no code implementations IWSLT (EMNLP) 2018 Jan Niehues, Roldano Cattoni, Sebastian Stüker, Mauro Cettolo, Marco Turchi, Marcello Federico

The International Workshop of Spoken Language Translation (IWSLT) 2018 Evaluation Campaign featured two tasks: low-resource machine translation and speech translation.

Machine Translation Translation

Domain-independent Punctuation and Segmentation Insertion

no code implementations IWSLT 2017 Eunah Cho, Jan Niehues, Alex Waibel

Experiments show that generalizing rare and unknown words greatly improves the punctuation insertion performance, reaching up to 8.8 points of improvement in F-score when applied to the out-of-domain test scenario.

Machine Translation POS +1

KIT’s Multilingual Neural Machine Translation systems for IWSLT 2017

no code implementations IWSLT 2017 Ngoc-Quan Pham, Matthias Sperber, Elizabeth Salesky, Thanh-Le Ha, Jan Niehues, Alexander Waibel

For the SLT track, in addition to a monolingual neural translation system used to generate correct punctuations and true cases of the data prior to training our multilingual system, we introduced a noise model in order to make our system more robust.

Machine Translation NMT +1


no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.


Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

Toward Robust Neural Machine Translation for Noisy Input Sequences

no code implementations IWSLT 2017 Matthias Sperber, Jan Niehues, Alex Waibel

We note that, unlike our baseline model, models trained on noisy data are able to generate outputs of proper length even for noisy inputs, while gradually reducing output length for higher amounts of noise, as might also be expected from a human translator.

Machine Translation Translation

Audience-specific Explanations for Machine Translation

no code implementations 22 Sep 2023 Renhan Lou, Jan Niehues

In this work, we propose a semi-automatic technique to extract these explanations from a large parallel corpus.

Machine Translation Translation

How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?

1 code implementation 15 Sep 2023 Danni Liu, Jan Niehues

Given recent progress in pretrained massively multilingual translation models, we use them as a foundation to transfer the attribute controlling capabilities to languages without supervised data.

Machine Translation Translation

KIT's Multilingual Speech Translation System for IWSLT 2023

1 code implementation 8 Jun 2023 Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks.

Data Augmentation Retrieval +1

Perturbation-based QE: An Explainable, Unsupervised Word-level Quality Estimation Method for Blackbox Machine Translation

1 code implementation 12 May 2023 Tu Anh Dinh, Jan Niehues

Quality Estimation (QE) is the task of predicting the quality of Machine Translation (MT) system output, without using any gold-standard translation references.

Machine Translation Translation +1

Train Global, Tailor Local: Minimalist Multilingual Translation into Endangered Languages

no code implementations 5 May 2023 Zhong Zhou, Jan Niehues, Alex Waibel

We examine two approaches: (1) selecting the best seed sentences to jump-start translation in a new language, with a view to the best generalization to the remainder of a larger targeted text or texts, and (2) adapting large general multilingual translation engines from many other languages to focus on a specific text in a new, unknown language.

Humanitarian Translation

Towards continually learning new languages

no code implementations 21 Nov 2022 Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training.

Speech Recognition +1

Efficient Speech Translation with Pre-trained Models

no code implementations 9 Nov 2022 Zhaolin Li, Jan Niehues

When building state-of-the-art speech translation models, the need for large computational resources is a significant obstacle due to the large training data size and complex models.


Learning an Artificial Language for Knowledge-Sharing in Multilingual Translation

1 code implementation 2 Nov 2022 Danni Liu, Jan Niehues

In this work, we discretize the encoder output latent space of multilingual models by assigning encoder states to entries in a codebook, which in effect represents source sentences in a new artificial language.


Adaptive multilingual speech recognition with pretrained models

no code implementations 24 May 2022 Ngoc-Quan Pham, Alex Waibel, Jan Niehues

Multilingual speech recognition with supervised learning has achieved great results as reflected in recent research.

Speech Recognition

LibriS2S: A German-English Speech-to-Speech Translation Corpus

1 code implementation LREC 2022 Pedro Jeuris, Jan Niehues

In contrast, activity in the area of speech-to-speech translation is still limited, although it is essential to overcome the language barrier.

Speech-to-Speech Translation Speech-to-Text Translation +1

Multilingual Simultaneous Speech Translation

no code implementations 28 Mar 2022 Shashank Subramanya, Jan Niehues

Based on a technique to adapt end-to-end monolingual models, we investigate multilingual models and different architectures (end-to-end and cascade) on the ability to perform online speech translation.


Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques

1 code implementation 26 Jan 2022 Tu Anh Dinh, Danni Liu, Jan Niehues

We investigate whether these ideas can be applied to speech translation, by building ST models trained on speech transcription and text translation data.

Data Augmentation Machine Translation +1

Cost-Effective Training in Low-Resource Neural Machine Translation

no code implementations 14 Jan 2022 Sai Koneru, Danni Liu, Jan Niehues

Although active learning (AL) is shown to be helpful with large budgets, it is not enough to build high-quality translation systems in these low-resource conditions.

Active Learning Domain Adaptation +3

Tutorial Proposal: End-to-End Speech Translation

no code implementations EACL 2021 Jan Niehues, Elizabeth Salesky, Marco Turchi, Matteo Negri

Speech translation is the translation of speech in one language typically to text in another, traditionally accomplished through a combination of automatic speech recognition and machine translation.

Automatic Speech Recognition (ASR) +3

Continuous Learning in Neural Machine Translation using Bilingual Dictionaries

no code implementations EACL 2021 Jan Niehues

For humans, as well as for machine translation, bilingual dictionaries are a promising knowledge source to continuously integrate new knowledge.

LEMMA Machine Translation +2

Improving Zero-Shot Translation by Disentangling Positional Information

1 code implementation ACL 2021 Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li

The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training.

Machine Translation Translation


no code implementations WS 2020 Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.


Adapting End-to-End Speech Recognition for Readable Subtitles

1 code implementation WS 2020 Danni Liu, Jan Niehues, Gerasimos Spanakis

The experiments show that with limited data far less than needed for training a model from scratch, we can adapt a Transformer-based ASR model to incorporate both transcription and compression capabilities.

Automatic Speech Recognition (ASR) +1

Relative Positional Encoding for Speech Recognition and Direct Translation

no code implementations 20 May 2020 Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation.

Sentence segmentation speech-recognition +2

Incremental processing of noisy user utterances in the spoken language understanding task

no code implementations WS 2019 Stefan Constantin, Jan Niehues, Alex Waibel

The state-of-the-art neural network architectures make it possible to create spoken language understanding systems with high quality and fast processing time.

Natural Language Understanding Spoken Language Understanding

Very Deep Self-Attention Networks for End-to-End Speech Recognition

no code implementations 30 Apr 2019 Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel

Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community.

Speech Recognition

Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation

no code implementations TACL 2019 Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel

Speech translation has traditionally been approached through cascaded models consisting of a speech recognizer trained on a corpus of transcribed speech, and a machine translation system trained on parallel texts.

Machine Translation speech-recognition +2

Multi-task learning to improve natural language understanding

no code implementations 17 Dec 2018 Stefan Constantin, Jan Niehues, Alex Waibel

When building a neural network-based Natural Language Understanding component, one main challenge is to collect enough training data.

Multi-Task Learning Natural Language Understanding

Optimizing Segmentation Granularity for Neural Machine Translation

no code implementations 19 Oct 2018 Elizabeth Salesky, Andrew Runge, Alex Coda, Jan Niehues, Graham Neubig

However, the granularity of these subword units is a hyperparameter to be tuned for each language and task, using methods such as grid search.

Machine Translation NMT +1

Low-Latency Neural Speech Translation

no code implementations 1 Aug 2018 Jan Niehues, Ngoc-Quan Pham, Thanh-Le Ha, Matthias Sperber, Alex Waibel

After adaptation, we are able to reduce the number of corrections displayed during incremental output construction by 45%, without a decrease in translation quality.

Machine Translation Multi-Task Learning +2

A Hierarchical Approach to Neural Context-Aware Modeling

no code implementations 27 Jul 2018 Patrick Huber, Jan Niehues, Alex Waibel

Our approach overcomes recent limitations with extended narratives through a multi-layered computational approach to generate an abstract context representation.

Binary Classification Language Modelling

Robust and Scalable Differentiable Neural Computer for Question Answering

1 code implementation WS 2018 Jörg Franke, Jan Niehues, Alex Waibel

Deep learning models are often not easily adaptable to new tasks and require task-specific adjustments.

Question Answering

Self-Attentional Acoustic Models

1 code implementation 26 Mar 2018 Matthias Sperber, Jan Niehues, Graham Neubig, Sebastian Stüker, Alex Waibel

Self-attention is a method of encoding sequences of vectors by relating these vectors to each other based on pairwise similarities.

Automated Evaluation of Out-of-Context Errors

1 code implementation LREC 2018 Patrick Huber, Jan Niehues, Alex Waibel

We present a new approach to evaluate computational models for the task of text understanding by the means of out-of-context error detection.

Binary Classification Language Modelling +1

An End-to-End Goal-Oriented Dialog System with a Generative Natural Language Response Generation

no code implementations 6 Mar 2018 Stefan Constantin, Jan Niehues, Alex Waibel

Furthermore, by using a feedforward neural network, we are able to generate the output word by word and are no longer restricted to a fixed number of possible response candidates.

Goal-Oriented Dialog Response Generation

Effective Strategies in Zero-Shot Neural Machine Translation

1 code implementation IWSLT 2017 Thanh-Le Ha, Jan Niehues, Alexander Waibel

In this paper, we proposed two strategies which can be applied to a multilingual neural machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus.

Machine Translation Translation

Transcribing Against Time

no code implementations 15 Sep 2017 Matthias Sperber, Graham Neubig, Jan Niehues, Satoshi Nakamura, Alex Waibel

We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion.

Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning

no code implementations WS 2017 Jan Niehues, Eunah Cho

Linguistic resources such as part-of-speech (POS) tags have been extensively used in statistical machine translation (SMT) frameworks and have yielded better performances.

Machine Translation Multi-Task Learning +3

Analyzing Neural MT Search and Model Performance

no code implementations WS 2017 Jan Niehues, Eunah Cho, Thanh-Le Ha, Alex Waibel

By separating the search space and the modeling using n-best list reranking, we analyze the influence of both parts of an NMT system independently.

NMT Translation

Neural Lattice-to-Sequence Models for Uncertain Inputs

no code implementations EMNLP 2017 Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel

In this work, we extend the TreeLSTM (Tai et al., 2015) into a LatticeLSTM that is able to consume word lattices, and can be used as encoder in an attentional encoder-decoder model.


Lightly Supervised Quality Estimation

no code implementations COLING 2016 Matthias Sperber, Graham Neubig, Jan Niehues, Sebastian Stüker, Alex Waibel

Evaluating the quality of output from language processing systems such as machine translation or speech recognition is an essential step in ensuring that they are sufficient for practical use.

Automatic Speech Recognition (ASR) Machine Translation +2

Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder

no code implementations IWSLT 2016 Thanh-Le Ha, Jan Niehues, Alexander Waibel

In this paper, we present our first attempts in building a multilingual Neural Machine Translation framework under a unified approach.

Machine Translation NMT +1

Lexical Translation Model Using a Deep Neural Network Architecture

no code implementations 28 Apr 2015 Thanh-Le Ha, Jan Niehues, Alex Waibel

In this paper we combine the advantages of a model using global source sentence contexts, the Discriminative Word Lexicon, and neural networks.

