Search Results for author: Hermann Ney

Found 167 papers, 25 papers with code

A Convergence Analysis of Log-Linear Training

no code implementations NeurIPS 2011 Simon Wiesler, Hermann Ney

Log-linear models are widely used probability models for statistical pattern recognition.

Handwriting Recognition

Arabic-Segmentation Combination Strategies for Statistical Machine Translation

no code implementations LREC 2012 Saab Mansour, Hermann Ney

Next, we try a different strategy, where we combine the different segmentation methods rather than the different segmentation schemes.

Machine Translation Segmentation +1

RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus

no code implementations LREC 2012 Jens Forster, Christoph Schmidt, Thomas Hoyoux, Oscar Koller, Uwe Zelle, Justus Piater, Hermann Ney

This paper introduces the RWTH-PHOENIX-Weather corpus, a video-based, large vocabulary corpus of German Sign Language suitable for statistical sign language recognition and translation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather

no code implementations LREC 2014 Jens Forster, Christoph Schmidt, Oscar Koller, Martin Bellgardt, Hermann Ney

This paper introduces the RWTH-PHOENIX-Weather 2014, a video-based, large vocabulary, German sign language corpus which has been extended over the last two years, tripling the size of the original corpus.

2k Object Tracking +5

Deep Hand: How to Train a CNN on 1 Million Hand Images When Your Data Is Continuous and Weakly Labelled

no code implementations CVPR 2016 Oscar Koller, Hermann Ney, Richard Bowden

Furthermore, we demonstrate its use in continuous sign language recognition on two publicly available large sign language data sets, where it outperforms the current state-of-the-art by a large margin.

Sign Language Recognition Video Recognition

A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition

no code implementations22 Jun 2016 Albert Zeyer, Patrick Doetsch, Paul Voigtlaender, Ralf Schlüter, Hermann Ney

On this task, we get our best result with an 8 layer bidirectional LSTM and we show that a pretraining scheme with layer-wise construction helps for deep LSTMs.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Local System Voting Feature for Machine Translation System Combination

no code implementations WS 2015 Markus Freitag, Jan-Thorsten Peter, Stephan Peitz, Minwei Feng, Hermann Ney

In this paper, we enhance the traditional confusion network system combination approach with an additional model trained by a neural network.

Machine Translation Sentence +1

Re-Sign: Re-Aligned End-To-End Sequence Modelling With Deep Recurrent CNN-HMMs

no code implementations CVPR 2017 Oscar Koller, Sepehr Zargaran, Hermann Ney

This work presents an iterative re-alignment approach applicable to visual sequence labelling tasks such as gesture recognition, activity recognition and continuous sign language recognition.

Activity Recognition Gesture Recognition +1

Hybrid Neural Network Alignment and Lexicon Model in Direct HMM for Statistical Machine Translation

no code implementations ACL 2017 Weiyue Wang, Tamer Alkhouli, Derui Zhu, Hermann Ney

Recently, the neural machine translation systems showed their promising performance and surpassed the phrase-based systems for most translation tasks.

Machine Translation Translation +1

Improved training of end-to-end attention models for speech recognition

14 code implementations8 May 2018 Albert Zeyer, Kazuki Irie, Ralf Schlüter, Hermann Ney

Sequence-to-sequence attention-based models on subword units allow simple open-vocabulary end-to-end speech recognition.

Ranked #43 on Speech Recognition on LibriSpeech test-clean (using extra training data)

Language Modelling Speech Recognition
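The subword units mentioned in this abstract are typically learned with byte-pair encoding. As a minimal illustration (not the paper's implementation), the classic BPE merge-learning loop repeatedly merges the most frequent adjacent symbol pair; the toy corpus and frequencies below are hypothetical:

```python
from collections import Counter

def get_pair_stats(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace each occurrence of the pair with its merged symbol.
    (Plain string replace is fine for this single-character toy case.)"""
    a, b = pair
    bigram, replacement = f"{a} {b}", a + b
    return {word.replace(bigram, replacement): freq for word, freq in vocab.items()}

def learn_bpe(vocab, num_merges):
    """Learn BPE merge operations from a {spaced-word: frequency} vocab."""
    merges = []
    for _ in range(num_merges):
        stats = get_pair_stats(vocab)
        if not stats:
            break
        best = max(stats, key=stats.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab

# Toy corpus: words pre-split into characters, with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}
merges, vocab = learn_bpe(vocab, 4)
print(merges)  # first learned merge on this toy corpus is ('e', 's')
```

Applying the learned merge list to new words yields an open vocabulary: any word decomposes into known subwords, falling back to characters.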

RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition

3 code implementations ACL 2018 Albert Zeyer, Tamer Alkhouli, Hermann Ney

We demonstrate the fast training and decoding speed of RETURNN attention models for translation, enabled by fast CUDA LSTM kernels, and a fast pure TensorFlow beam search decoder.

speech-recognition Speech Recognition +1
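The beam search decoder referenced above keeps only the top-scoring partial hypotheses at each step. A toy sketch of that pruning idea (independent per-step scores instead of a real model, so this is illustrative only):

```python
import math

def beam_search(step_scores, beam_size=2):
    """Fixed-length beam search over per-step log-probabilities (toy, no model)."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-prob)
    for scores in step_scores:  # scores: dict token -> log-prob at this step
        candidates = []
        for seq, lp in beams:
            for tok, s in scores.items():
                candidates.append((seq + (tok,), lp + s))
        # Keep only the beam_size best-scoring hypotheses.
        beams = sorted(candidates, key=lambda x: -x[1])[:beam_size]
    return beams[0][0]

steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"a": math.log(0.3), "b": math.log(0.7)},
]
print(beam_search(steps))  # → ('a', 'b')
```

In a real decoder the step scores depend on the hypothesis so far, which is exactly why pruning to a beam matters.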

Neural Sign Language Translation

1 code implementation CVPR 2018 Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, Hermann Ney, Richard Bowden

SLR seeks to recognize a sequence of continuous signs but neglects the underlying rich grammatical and linguistic structures of sign language that differ from spoken language.

Gesture Recognition Language Modelling +5

Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition

no code implementations19 Jun 2018 Tobias Menne, Ralf Schlüter, Hermann Ney

The proposed adaptation approach is based on the integration of the beamformer, which includes the mask estimation network, and the acoustic model of the ASR system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Neural Hidden Markov Model for Machine Translation

no code implementations ACL 2018 Weiyue Wang, Derui Zhu, Tamer Alkhouli, Zixuan Gan, Hermann Ney

Attention-based neural machine translation (NMT) models selectively focus on specific source positions to produce a translation, which brings significant improvements over pure encoder-decoder sequence-to-sequence models.

Machine Translation NMT +1

On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation

no code implementations WS 2018 Tamer Alkhouli, Gabriel Bretschner, Hermann Ney

This work investigates the alignment problem in state-of-the-art multi-head attention models based on the transformer architecture.

Machine Translation Translation

Improving Neural Language Models with Weight Norm Initialization and Regularization

no code implementations WS 2018 Christian Herold, Yingbo Gao, Hermann Ney

Embedding and projection matrices are commonly used in neural language models (NLM) as well as in other sequence processing networks that operate on large vocabularies.

Automatic Speech Recognition (ASR) Language Modelling +1

The RWTH Aachen University English-German and German-English Unsupervised Neural Machine Translation Systems for WMT 2018

no code implementations WS 2018 Miguel Graça, Yunsu Kim, Julian Schamper, Jiahui Geng, Hermann Ney

This paper describes the unsupervised neural machine translation (NMT) systems of the RWTH Aachen University developed for the English ↔ German news translation task of the EMNLP 2018 Third Conference on Machine Translation (WMT 2018).

Machine Translation NMT +2

The RWTH Aachen University Supervised Machine Translation Systems for WMT 2018

1 code implementation WS 2018 Julian Schamper, Jan Rosendahl, Parnia Bahar, Yunsu Kim, Arne Nix, Hermann Ney

In total we improve by 6.8% BLEU over last year's submission and by 4.8% BLEU over the winning system of the 2017 German→English task.

Machine Translation Translation

Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation

1 code implementation EMNLP 2018 Parnia Bahar, Christopher Brix, Hermann Ney

This work investigates an alternative model for neural machine translation (NMT) and proposes a novel architecture, where we employ a multi-dimensional long short-term memory (MDLSTM) for translation modeling.

Machine Translation NMT +2

Sisyphus, a Workflow Manager Designed for Machine Translation and Automatic Speech Recognition

no code implementations EMNLP 2018 Jan-Thorsten Peter, Eugen Beck, Hermann Ney

Training and testing many possible parameters or model architectures of state-of-the-art machine translation or automatic speech recognition system is a cumbersome task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder

no code implementations EMNLP 2018 Yunsu Kim, Jiahui Geng, Hermann Ney

Unsupervised learning of cross-lingual word embedding offers elegant matching of words across languages, but has fundamental limitations in translating sentences.

Denoising Language Modelling +2
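The word-by-word baseline this paper improves on translates each source word to its nearest target word in a shared cross-lingual embedding space. A minimal sketch with hypothetical 2-d embeddings (the real systems use learned, high-dimensional vectors):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def word_by_word_translate(sentence, src_emb, tgt_emb):
    """Map each source word to its nearest target word in the shared space."""
    out = []
    for word in sentence.split():
        vec = src_emb[word]
        best = max(tgt_emb, key=lambda t: cosine(vec, tgt_emb[t]))
        out.append(best)
    return " ".join(out)

# Hypothetical aligned embedding spaces, for illustration only.
src_emb = {"hund": [1.0, 0.1], "katze": [0.1, 1.0]}
tgt_emb = {"dog": [0.9, 0.2], "cat": [0.2, 0.9], "car": [-1.0, 0.0]}
print(word_by_word_translate("hund katze", src_emb, tgt_emb))  # → dog cat
```

The paper's point is that such word-level lookup ignores context and word order, which is what the language model and denoising autoencoder are added to fix.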

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes

no code implementations EACL 2017 Yunsu Kim, Julian Schamper, Hermann Ney

We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words.

Translation

A Comparative Study on Vocabulary Reduction for Phrase Table Smoothing

no code implementations WS 2016 Yunsu Kim, Andreas Guta, Joern Wuebker, Hermann Ney

This work systematically analyzes the smoothing effect of vocabulary reduction for phrase translation models.

Translation

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation

2 code implementations8 May 2019 Christoph Lüscher, Eugen Beck, Kazuki Irie, Markus Kitza, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney

To the best of the authors' knowledge, the results obtained when training on the full LibriSpeech training set are the best published to date, both for the hybrid DNN/HMM and the attention-based systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech

no code implementations9 May 2019 Tobias Menne, Ilya Sklyar, Ralf Schlüter, Hermann Ney

In a more realistic ASR scenario the audio signal contains significant portions of single-speaker speech and only part of the signal contains speech of multiple competing speakers.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Language Modeling with Deep Transformers

no code implementations10 May 2019 Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney

We explore deep autoregressive Transformer models in language modeling for speech recognition.

Language Modelling speech-recognition +1

Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies

1 code implementation ACL 2019 Yunsu Kim, Yingbo Gao, Hermann Ney

Transfer learning or multilingual model is essential for low-resource neural machine translation (NMT), but the applicability is limited to cognate languages by sharing their vocabularies.

Cross-Lingual Transfer Low-Resource Neural Machine Translation +3

Cumulative Adaptation for BLSTM Acoustic Models

no code implementations14 Jun 2019 Markus Kitza, Pavel Golik, Ralf Schlüter, Hermann Ney

Further, i-vectors were used as an input to the neural network to perform instantaneous speaker and environment adaptation, providing 8% relative improvement in word error rate on the NIST Hub5 2000 evaluation test set.

Acoustic Modelling Automatic Speech Recognition +4

Generalizing Back-Translation in Neural Machine Translation

no code implementations WS 2019 Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney

Back-translation - data augmentation by translating target monolingual data - is a crucial component in modern neural machine translation (NMT).

Data Augmentation Machine Translation +3

LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring

no code implementations1 Jul 2019 Eugen Beck, Wei Zhou, Ralf Schlüter, Hermann Ney

LSTM based language models are an important part of modern LVCSR systems as they significantly improve performance over traditional backoff language models.

Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR

no code implementations1 Jul 2019 Wilfried Michel, Ralf Schlüter, Hermann Ney

This allows for a direct comparison of lattice-based and lattice-free sequence discriminative training criteria such as MMI and sMBR, both using the same language model during training.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

EED: Extended Edit Distance Measure for Machine Translation

no code implementations WS 2019 Peter Stanchev, Weiyue Wang, Hermann Ney

Over the years a number of machine translation metrics have been developed in order to evaluate the accuracy and quality of machine-generated translations.

Machine Translation Translation
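EED extends the classic Levenshtein edit distance with additional jump operations; the dynamic-programming core it builds on can be sketched as follows (this is plain edit distance, not EED itself):

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two sequences via dynamic programming."""
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of hyp[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of ref[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match / substitution
    return d[m][n]

print(edit_distance("kitten", "sitting"))  # → 3
```

Normalizing such a distance by the reference length gives the familiar error-rate family (CER/WER) that MT and ASR metrics are built on.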

The RWTH Aachen University Machine Translation Systems for WMT 2019

no code implementations WS 2019 Jan Rosendahl, Christian Herold, Yunsu Kim, Miguel Graça, Weiyue Wang, Parnia Bahar, Yingbo Gao, Hermann Ney

For the De-En task, none of the tested methods gave a significant improvement over last year's winning system and we end up with the same performance, resulting in 39.6% BLEU on newstest2019.

Attribute Language Modelling +3

uniblock: Scoring and Filtering Corpus with Unicode Block Information

1 code implementation IJCNLP 2019 Yingbo Gao, Weiyue Wang, Hermann Ney

The preprocessing pipelines in Natural Language Processing usually involve a step of removing sentences consisting of illegal characters.

Language Modelling Machine Translation +4
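uniblock scores sentences with a Bayesian model over Unicode-block count features; a much cruder sketch of the underlying idea, filtering by the fraction of characters outside allowed blocks (the block table below is a tiny hypothetical excerpt, and the threshold is arbitrary):

```python
# A few Unicode block ranges (start, end, name); the real tool uses the full table.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
]

def block_of(ch):
    """Name of the Unicode block containing the character, or 'Other'."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"

def illegal_ratio(sentence, allowed):
    """Fraction of non-space characters falling outside the allowed blocks."""
    chars = [c for c in sentence if not c.isspace()]
    if not chars:
        return 0.0
    bad = sum(1 for c in chars if block_of(c) not in allowed)
    return bad / len(chars)

def filter_corpus(sentences, allowed, threshold=0.1):
    """Keep sentences whose illegal-character ratio stays below the threshold."""
    return [s for s in sentences if illegal_ratio(s, allowed) < threshold]

corpus = ["ein sauberer Satz", "mixed 漢字 noise ¤¤¤", "clean line"]
print(filter_corpus(corpus, allowed={"Basic Latin", "Latin-1 Supplement"}))
```

The advantage of learning the score from data, as uniblock does, is that no hand-picked block list or threshold is needed.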

Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages

no code implementations IJCNLP 2019 Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, Hermann Ney

We present effective pre-training strategies for neural machine translation (NMT) using parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a significant improvement in source-target translation.

Machine Translation NMT +3

When and Why is Document-level Context Useful in Neural Machine Translation?

1 code implementation WS 2019 Yunsu Kim, Duc Thanh Tran, Hermann Ney

Document-level context has received much attention as a way to compensate for the limitations of neural machine translation (NMT) of isolated sentences.

Machine Translation NMT +1

Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification

no code implementations EMNLP (IWSLT) 2019 Yingbo Gao, Christian Herold, Weiyue Wang, Hermann Ney

Prominently used in support vector machines and logistic regressions, kernel functions (kernels) can implicitly map data points into high dimensional spaces and make it easier to learn complex decision boundaries.

General Classification Language Modelling +2

ELoPE: Fine-Grained Visual Classification with Efficient Localization, Pooling and Embedding

1 code implementation17 Nov 2019 Harald Hanselmann, Hermann Ney

The task of fine-grained visual classification (FGVC) deals with classification problems that display a small inter-class variance such as distinguishing between different bird species or car models.

Fine-Grained Image Classification General Classification

A Comparative Study on End-to-end Speech to Text Translation

no code implementations20 Nov 2019 Parnia Bahar, Tobias Bieschke, Hermann Ney

Recent advances in deep learning show that end-to-end speech to text translation model is a promising approach to direct the speech translation field.

Speech-to-Text Translation Translation

Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems

1 code implementation19 Dec 2019 Nick Rossenbach, Albert Zeyer, Ralf Schlüter, Hermann Ney

We achieve improvements of up to 33% relative in word-error-rate (WER) over a strong baseline with data augmentation in a low-resource environment (LibriSpeech-100h), closing the gap to a comparable oracle experiment by more than 50%.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment

no code implementations2 Apr 2020 Wei Zhou, Wilfried Michel, Kazuki Irie, Markus Kitza, Ralf Schlüter, Hermann Ney

We present a complete training pipeline to build a state-of-the-art hybrid HMM-based ASR system on the 2nd release of the TED-LIUM corpus.

Data Augmentation

When and Why is Unsupervised Neural Machine Translation Useless?

no code implementations EAMT 2020 Yunsu Kim, Miguel Graça, Hermann Ney

This paper studies the practicality of the current state-of-the-art unsupervised methods in neural machine translation (NMT).

Machine Translation NMT +2

Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture

no code implementations ACL 2020 Christopher Brix, Parnia Bahar, Hermann Ney

Sparse models require less memory for storage and enable a faster inference by reducing the necessary number of FLOPs.

Fine-Grained Visual Classification with Efficient End-to-end Localization

no code implementations11 May 2020 Harald Hanselmann, Hermann Ney

The term fine-grained visual classification (FGVC) refers to classification tasks where the classes are very similar and the classification model needs to be able to find subtle differences to make the correct prediction.

Classification Fine-Grained Image Classification +1

Context-Dependent Acoustic Modeling without Explicit Phone Clustering

no code implementations15 May 2020 Tina Raissi, Eugen Beck, Ralf Schlüter, Hermann Ney

In this work, we address a direct phonetic context modeling for the hybrid deep neural network (DNN)/HMM, that does not build on any phone clustering algorithm for the determination of the HMM state inventory.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

A New Training Pipeline for an Improved Neural Transducer

1 code implementation19 May 2020 Albert Zeyer, André Merboldt, Ralf Schlüter, Hermann Ney

We compare the original training criterion with the full marginalization over all alignments, to the commonly used maximum approximation, which simplifies, improves and speeds up our training.

A systematic comparison of grapheme-based vs. phoneme-based label units for encoder-decoder-attention models

1 code implementation19 May 2020 Mohammad Zeineldeen, Albert Zeyer, Wei Zhou, Thomas Ng, Ralf Schlüter, Hermann Ney

Following the rationale of end-to-end modeling, CTC, RNN-T or encoder-decoder-attention models for automatic speech recognition (ASR) use graphemes or grapheme-based subword units based on, e.g., byte-pair encoding (BPE).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Investigation of Large-Margin Softmax in Neural Language Modeling

no code implementations20 May 2020 Jingjing Huo, Yingbo Gao, Weiyue Wang, Ralf Schlüter, Hermann Ney

After that, we apply the best norm-scaling setup in combination with various margins and conduct neural language models rescoring experiments in automatic speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Diving Deep into Context-Aware Neural Machine Translation

no code implementations WMT (EMNLP) 2020 Jingjing Huo, Christian Herold, Yingbo Gao, Leonard Dahlmann, Shahram Khadivi, Hermann Ney

Context-aware neural machine translation (NMT) is a promising direction to improve the translation quality by making use of the additional context, e. g., document-level translation, or having meta-information.

Machine Translation NMT +1

Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition

no code implementations30 Oct 2020 Wei Zhou, Simon Berger, Ralf Schlüter, Hermann Ney

To join the advantages of classical and end-to-end approaches for speech recognition, we present a simple, novel and competitive approach for phoneme-based neural transducer modeling.

Language Modelling speech-recognition +1

Multi-Agent Mutual Learning at Sentence-Level and Token-Level for Neural Machine Translation

no code implementations Findings of the Association for Computational Linguistics 2020 Baohao Liao, Yingbo Gao, Hermann Ney

Mutual learning, where multiple agents learn collaboratively and teach one another, has been shown to be an effective way to distill knowledge for image classification tasks.

Image Classification Machine Translation +2

Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional Translation Modeling using a Two-Dimensional Grid

no code implementations24 Nov 2020 Parnia Bahar, Christopher Brix, Hermann Ney

Neural translation models have proven to be effective in capturing sufficient information from a source sentence and generating a high-quality target sentence.

Machine Translation Sentence +2

Tight Integrated End-to-End Training for Cascaded Speech Translation

no code implementations24 Nov 2020 Parnia Bahar, Tobias Bieschke, Ralf Schlüter, Hermann Ney

Direct speech translation is an alternative method to avoid error propagation; however, its performance is often behind the cascade system.

Translation

Neural Language Modeling for Named Entity Recognition

no code implementations COLING 2020 Zhihong Lei, Weiyue Wang, Christian Dugast, Hermann Ney

Named entity recognition is a key component in various natural language processing systems, and neural architectures provide significant improvements over conventional approaches.

Language Modelling named-entity-recognition +2

Unifying Input and Output Smoothing in Neural Machine Translation

no code implementations COLING 2020 Yingbo Gao, Baohao Liao, Hermann Ney

Soft contextualized data augmentation is a recent method that replaces one-hot representation of words with soft posterior distributions of an external language model, smoothing the input of neural machine translation systems.

Data Augmentation Language Modelling +2
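The input smoothing described here replaces a one-hot word representation with a mixture of the one-hot vector and an external model's posterior. A minimal sketch of that mixing step (the interpolation weight and the uniform "LM" distribution are hypothetical):

```python
def soften(one_hot_index, lm_dist, weight=0.1):
    """Mix a one-hot target with an external distribution.
    Result keeps mass 1-weight on the true word and spreads the rest per lm_dist."""
    out = [weight * p for p in lm_dist]
    out[one_hot_index] += 1.0 - weight
    return out

# Toy 4-word vocabulary, uniform external distribution for illustration.
dist = soften(2, [0.25, 0.25, 0.25, 0.25])
print(dist)  # → [0.025, 0.025, 0.925, 0.025]
```

With a uniform external distribution this reduces to standard label smoothing; the paper's point is to use contextual posteriors instead and to apply the same idea symmetrically on the input and output sides.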

Efficient Retrieval Augmented Generation from Unstructured Knowledge for Task-Oriented Dialog

1 code implementation9 Feb 2021 David Thulke, Nico Daheim, Christian Dugast, Hermann Ney

This paper summarizes our work on the first track of the ninth Dialog System Technology Challenge (DSTC 9), "Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access".

Retrieval

A study of latent monotonic attention variants

no code implementations30 Mar 2021 Albert Zeyer, Ralf Schlüter, Hermann Ney

We compare several monotonic latent models to our global soft attention baseline such as a hard attention model, a local windowed soft attention model, and a segmental soft attention model.

Hard Attention speech-recognition +1

On Architectures and Training for Raw Waveform Feature Extraction in ASR

no code implementations9 Apr 2021 Peter Vieting, Christoph Lüscher, Wilfried Michel, Ralf Schlüter, Hermann Ney

With the success of neural network based modeling in automatic speech recognition (ASR), many studies investigated acoustic modeling and learning of feature extractors directly based on the raw waveform.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures

no code implementations12 Apr 2021 Nick Rossenbach, Mohammad Zeineldeen, Benedikt Hilmes, Ralf Schlüter, Hermann Ney

We achieve a final word-error-rate of 3.3%/10.0% with a hybrid system on the clean/noisy test sets, surpassing any previous state-of-the-art systems on LibriSpeech-100h that do not include unlabeled audio data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept

no code implementations13 Apr 2021 Wei Zhou, Albert Zeyer, André Merboldt, Ralf Schlüter, Hermann Ney

With the advent of direct models in automatic speech recognition (ASR), the formerly prevalent frame-wise acoustic modeling based on hidden Markov models (HMM) diversified into a number of modeling architectures like encoder-decoder attention models, transducer models and segmental models (direct HMM).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition

no code implementations19 Apr 2021 Wei Zhou, Mohammad Zeineldeen, Zuoyun Zheng, Ralf Schlüter, Hermann Ney

Subword units are commonly used for end-to-end automatic speech recognition (ASR), while a fully acoustic-oriented subword modeling approach is somewhat missing.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

On Sampling-Based Training Criteria for Neural Language Modeling

no code implementations21 Apr 2021 Yingbo Gao, David Thulke, Alexander Gerstenberger, Khoa Viet Tran, Ralf Schlüter, Hermann Ney

As the vocabulary size of modern word-based language models becomes ever larger, many sampling-based training criteria are proposed and investigated.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Why does CTC result in peaky behavior?

1 code implementation31 May 2021 Albert Zeyer, Ralf Schlüter, Hermann Ney

The peaky behavior of CTC models is well known experimentally.
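The CTC objective in question sums the probability of every frame-level path that collapses to the target; peaky behavior means this sum ends up dominated by blank-heavy paths with sharp label spikes. A brute-force sketch of the objective for a tiny alphabet and sequence length (real implementations use the forward-backward algorithm, and the frame posteriors here are hypothetical):

```python
from itertools import product
import math

BLANK = "_"

def collapse(path):
    """CTC collapse: merge repeated symbols, then remove blanks."""
    out = []
    for sym in path:
        if not out or sym != out[-1]:
            out.append(sym)
    return tuple(s for s in out if s != BLANK)

def ctc_loss(probs, target, alphabet):
    """-log of the summed probability of all paths collapsing to the target."""
    T = len(probs)
    total = 0.0
    for path in product(alphabet, repeat=T):
        if collapse(path) == target:
            p = 1.0
            for t, sym in enumerate(path):
                p *= probs[t][sym]
            total += p
    return -math.log(total)

# Toy frame posteriors over {a, blank} for T=3 frames (uniform, hypothetical).
alphabet = ("a", BLANK)
probs = [{"a": 0.5, BLANK: 0.5}] * 3
print(ctc_loss(probs, ("a",), alphabet))  # 6 of the 8 paths collapse to ("a",)
```

Because many distinct paths share the credit, gradient descent is free to concentrate mass on blanks, which is the behavior the paper analyzes.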

Cascaded Span Extraction and Response Generation for Document-Grounded Dialog

1 code implementation ACL (dialdoc) 2021 Nico Daheim, David Thulke, Christian Dugast, Hermann Ney

For the second subtask, we use a cascaded model which grounds the response prediction on the predicted span instead of the full document.

Response Generation valid

Transformer-Based Direct Hidden Markov Model for Machine Translation

no code implementations ACL 2021 Weiyue Wang, Zijian Yang, Yingbo Gao, Hermann Ney

The neural hidden Markov model has been proposed as an alternative to attention mechanism in machine translation with recurrent neural networks.

Machine Translation Translation

On Language Model Integration for RNN Transducer based Speech Recognition

no code implementations13 Oct 2021 Wei Zhou, Zuoyun Zheng, Ralf Schlüter, Hermann Ney

In this work, we study various ILM correction-based LM integration methods formulated in a common RNN-T framework.

Language Modelling speech-recognition +1

Automatic Learning of Subword Dependent Model Scales

no code implementations18 Oct 2021 Felix Meyer, Wilfried Michel, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney

We show on the LibriSpeech (LBS) and Switchboard (SWB) corpora that the model scales for a combination of attention-based encoder-decoder acoustic model and language model can be learned as effectively as with manual tuning.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Efficient Sequence Training of Attention Models using Approximative Recombination

no code implementations18 Oct 2021 Nils-Philipp Wynands, Wilfried Michel, Jan Rosendahl, Ralf Schlüter, Hermann Ney

Lastly, it is shown that this technique can be used to effectively perform sequence discriminative training for attention-based encoder-decoder acoustic models on the LibriSpeech task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Conformer-based Hybrid ASR System for Switchboard Dataset

no code implementations5 Nov 2021 Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Wilfried Michel, Alexander Gerstenberger, Ralf Schlüter, Hermann Ney

The recently proposed conformer architecture has been successfully used for end-to-end automatic speech recognition (ASR) architectures achieving state-of-the-art performance on different datasets.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Self-Normalized Importance Sampling for Neural Language Modeling

no code implementations11 Nov 2021 Zijian Yang, Yingbo Gao, Alexander Gerstenberger, Jintao Jiang, Ralf Schlüter, Hermann Ney

Compared to our previous work, the criteria considered in this work are self-normalized and there is no need to further conduct a correction step.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Adapting Document-Grounded Dialog Systems to Spoken Conversations using Data Augmentation and a Noisy Channel Model

1 code implementation16 Dec 2021 David Thulke, Nico Daheim, Christian Dugast, Hermann Ney

This paper summarizes our submission to Task 2 of the second track of the 10th Dialog System Technology Challenge (DSTC10) "Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations".

Data Augmentation Task 2

Efficient Training of Neural Transducer for Speech Recognition

no code implementations22 Apr 2022 Wei Zhou, Wilfried Michel, Ralf Schlüter, Hermann Ney

In this work, we propose an efficient 3-stage progressive training pipeline to build high-performing neural transducer models from scratch with very limited computational resources in a reasonably short time.

speech-recognition Speech Recognition

Improving the Training Recipe for a Robust Conformer-based Hybrid Model

no code implementations26 Jun 2022 Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Ralf Schlüter, Hermann Ney

In this work, we investigate various methods for speaker adaptive training (SAT) based on feature-space approaches for a conformer-based acoustic model (AM) on the Switchboard 300h dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Revisiting Checkpoint Averaging for Neural Machine Translation

no code implementations21 Oct 2022 Yingbo Gao, Christian Herold, Zijian Yang, Hermann Ney

Checkpoint averaging is a simple and effective method to boost the performance of converged neural machine translation models.

Machine Translation Translation
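Checkpoint averaging itself is just an element-wise mean over the parameters of several saved checkpoints. A minimal sketch with checkpoints represented as plain dicts of flat weight lists (a real setup would average framework-specific state dicts):

```python
def average_checkpoints(checkpoints):
    """Element-wise average of parameter dicts from several checkpoints."""
    n = len(checkpoints)
    avg = {}
    for name in checkpoints[0]:
        vals = [ckpt[name] for ckpt in checkpoints]
        avg[name] = [sum(col) / n for col in zip(*vals)]  # per-parameter mean
    return avg

# Toy "checkpoints": parameter name -> flat list of weights.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [2.0]},
]
print(average_checkpoints(ckpts))  # → {'w': [2.0, 3.0], 'b': [1.0]}
```

The averaged model is then used for inference as-is; no retraining is involved, which is what makes the method so cheap.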

Does Joint Training Really Help Cascaded Speech Translation?

1 code implementation24 Oct 2022 Viet Anh Khoa Tran, David Thulke, Yingbo Gao, Christian Herold, Hermann Ney

Currently, in speech translation, the straightforward approach - cascading a recognition system with a translation system - delivers state-of-the-art results.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Monotonic segmental attention for automatic speech recognition

1 code implementation26 Oct 2022 Albert Zeyer, Robin Schmitt, Wei Zhou, Ralf Schlüter, Hermann Ney

We restrict the decoder attention to segments to avoid quadratic runtime of global attention, better generalize to long sequences, and eventually enable streaming.

Automatic Speech Recognition Automatic Speech Recognition (ASR)

Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model

1 code implementation31 Oct 2022 Nico Daheim, David Thulke, Christian Dugast, Hermann Ney

In this work, we present a model for document-grounded response generation in dialog that is decomposed into two components according to Bayes theorem.

Response Generation

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

1 code implementation9 Nov 2022 Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz

We show: (1) [MASK]s can indeed be appended at a later layer, being disentangled from the word embedding; (2) The gathering of contextualized information from unmasked tokens can be conducted with a few layers.

Enhancing and Adversarial: Improve ASR with Speaker Labels

no code implementations11 Nov 2022 Wei Zhou, Haotian Wu, Jingjing Xu, Mohammad Zeineldeen, Christoph Lüscher, Ralf Schlüter, Hermann Ney

Detailed analysis and experimental verification are conducted to show the optimal positions in the ASR neural network (NN) to apply speaker enhancing and adversarial training.

Multi-Task Learning

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers

no code implementations7 Dec 2022 Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney

Compared to the N-best-list based minimum Bayes risk objectives, lattice-free methods gain 40% - 70% relative training time speedup with a small degradation in performance.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Analyzing And Improving Neural Speaker Embeddings for ASR

no code implementations11 Jan 2023 Christoph Lüscher, Jingjing Xu, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney

By further adding neural speaker embeddings, we gain additional ~3% relative WER improvement on Hub5'00.

Speaker Verification

Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10

no code implementations14 Apr 2023 David Thulke, Nico Daheim, Christian Dugast, Hermann Ney

This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges (DSTC9 and DSTC10).

Automatic Speech Recognition Data Augmentation +2

RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition

no code implementations28 May 2023 Wei Zhou, Eugen Beck, Simon Berger, Ralf Schlüter, Hermann Ney

Modern public ASR tools usually provide rich support for training various sequence-to-sequence (S2S) models, but only rather simple decoding support, restricted to open-vocabulary scenarios.

Sequence-To-Sequence Speech Recognition speech-recognition

On Search Strategies for Document-Level Neural Machine Translation

no code implementations8 Jun 2023 Christian Herold, Hermann Ney

On the other hand, in most works, the question of how to perform search with the trained model is scarcely discussed, and sometimes not mentioned at all.

Machine Translation NMT +2

Improving Long Context Document-Level Machine Translation

no code implementations8 Jun 2023 Christian Herold, Hermann Ney

Document-level context for neural machine translation (NMT) is crucial to improve the translation consistency and cohesion, the translation of ambiguous inputs, as well as several other linguistic phenomena.

Document Level Machine Translation Machine Translation +3

Comparative Analysis of the wav2vec 2.0 Feature Extractor

no code implementations8 Aug 2023 Peter Vieting, Ralf Schlüter, Hermann Ney

In this work, we study its capability to replace the standard feature extraction methods in a connectionist temporal classification (CTC) ASR model and compare it to an alternative neural FE.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition

no code implementations15 Sep 2023 Mohammad Zeineldeen, Albert Zeyer, Ralf Schlüter, Hermann Ney

We study a streamable attention-based encoder-decoder model in which either the decoder, or both the encoder and decoder, operate on pre-defined, fixed-size windows called chunks.
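The chunked model above restricts attention to fixed-size windows. As a toy illustration (assuming non-overlapping chunks with no cross-chunk look-back, a simplification of the paper's setup), the corresponding self-attention mask can be built as:

```python
def chunk_attention_mask(seq_len, chunk_size):
    """Boolean mask: mask[i][j] is True iff position i may attend to
    position j, i.e. both positions fall into the same fixed-size chunk."""
    return [[(i // chunk_size) == (j // chunk_size) for j in range(seq_len)]
            for i in range(seq_len)]
```

Because each chunk is self-contained, the encoder can emit output as soon as a chunk is complete, which is what makes the model streamable.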

speech-recognition Speech Recognition

On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers

no code implementations25 Sep 2023 Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney

Empirically, we show that ILM subtraction and sequence discriminative training achieve similar effects across a wide range of experiments on Librispeech, including both MMI and minimum Bayes risk (MBR) criteria, as well as neural transducers and LMs of both full and limited context.
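For readers unfamiliar with ILM subtraction: during decoding, the transducer's estimated internal language model score is subtracted before the external LM score is added, in a log-linear combination. A minimal sketch (the weights and function name are illustrative; in practice the weights are tuned on held-out data):

```python
def ilm_subtracted_score(log_p_model, log_p_ilm, log_p_ext_lm,
                         lambda_ilm=0.3, lambda_ext=0.5):
    """Log-linear score combination used for ILM subtraction in decoding:
    subtract the transducer's internal LM estimate, add the external LM."""
    return log_p_model - lambda_ilm * log_p_ilm + lambda_ext * log_p_ext_lm
```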

Language Modelling Relation +2

End-to-End Training of a Neural HMM with Label and Transition Probabilities

1 code implementation4 Oct 2023 Daniel Mann, Tina Raissi, Wilfried Michel, Ralf Schlüter, Hermann Ney

We investigate recognition results as well as Viterbi alignments of our models.
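The Viterbi alignments mentioned above come from standard max-product dynamic programming over label (emission) and transition probabilities. A generic sketch in log space (illustrative; not the paper's neural parameterization):

```python
def viterbi(obs_logprobs, trans_logprobs, init_logprobs):
    """Minimal Viterbi decoder. obs_logprobs[t][s] is the label (emission)
    log-probability of state s at time t, trans_logprobs[p][s] the transition
    log-probability from p to s. Returns the most probable state sequence."""
    T, S = len(obs_logprobs), len(init_logprobs)
    delta = [init_logprobs[s] + obs_logprobs[0][s] for s in range(S)]
    backpointers = []
    for t in range(1, T):
        new_delta, ptr = [], []
        for s in range(S):
            best_prev = max(range(S),
                            key=lambda p: delta[p] + trans_logprobs[p][s])
            new_delta.append(delta[best_prev] + trans_logprobs[best_prev][s]
                             + obs_logprobs[t][s])
            ptr.append(best_prev)
        delta = new_delta
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(range(S), key=lambda s: delta[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]
```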

Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers

no code implementations11 Oct 2023 Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney

In this work, we investigate the effect of language models (LMs) with different context lengths and label units (phoneme vs. word) used in sequence discriminative training for phoneme-based neural transducers.

ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change

1 code implementation17 Jan 2024 David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein, Taylor Tragemann, Katie Nguyen, Ariana Fowler, Andrew Stanco, Jon Gabriel, Jordan Taylor, Dean Moro, Evgenii Tsymbalov, Juliette de Waal, Evgeny Matusov, Mudar Yaghi, Mohammad Shihadah, Hermann Ney, Christian Dugast, Jonathan Dotan, Daniel Erasmus

To increase the accessibility of our model to non-English speakers, we propose to make use of cascaded machine translation and show that this approach can perform comparably to natively multilingual models while being easier to scale to a large number of languages.

Machine Translation Retrieval

Recurrent Attention for the Transformer

no code implementations EMNLP (insights) 2021 Jan Rosendahl, Christian Herold, Frithjof Petrick, Hermann Ney

In this work, we conduct a comprehensive investigation on one of the centerpieces of modern machine translation systems: the encoder-decoder attention mechanism.

Machine Translation Translation

The RWTH Aachen Machine Translation Systems for IWSLT 2017

no code implementations IWSLT 2017 Parnia Bahar, Jan Rosendahl, Nick Rossenbach, Hermann Ney

This work describes the Neural Machine Translation (NMT) system of RWTH Aachen University developed for the English→German tracks of the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2017.

Domain Adaptation Machine Translation +2

Detecting Various Types of Noise for Neural Machine Translation

no code implementations Findings (ACL) 2022 Christian Herold, Jan Rosendahl, Joris Vanvinckenroye, Hermann Ney

The filtering and/or selection of training data is one of the core aspects to be considered when building a strong machine translation system. In their influential work, Khayrallah and Koehn (2018) investigated the impact of different types of noise on the performance of machine translation systems. In the same year, the WMT introduced a shared task on parallel corpus filtering, which went on to be repeated in the following years and resulted in many different filtering approaches being proposed.

In this work we aim to combine the recent achievements in data filtering with the original analysis of Khayrallah and Koehn (2018) and investigate whether state-of-the-art filtering systems are capable of removing all the suggested noise types. We observe that most of these types of noise can be detected with an accuracy of over 90% by modern filtering systems when operating in a well-studied high-resource setting. However, we also find that when confronted with more refined noise categories or when working with a less common language pair, the performance of the filtering systems is far from optimal, showing that there is still room for improvement in this area of research.
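To make the filtering setting concrete, here is one classic noise heuristic for parallel data: rejecting sentence pairs with empty sides or implausible source/target length ratios. This toy rule is only an illustration of the kind of signal filtering systems use; the state-of-the-art systems evaluated in the paper are far more sophisticated.

```python
def length_ratio_ok(src_tokens, tgt_tokens, max_ratio=3.0, min_len=1):
    """Toy parallel-corpus filter: reject empty segments and pairs whose
    source/target token-length ratio exceeds max_ratio (a classic
    misalignment heuristic; thresholds here are illustrative)."""
    ls, lt = len(src_tokens), len(tgt_tokens)
    if ls < min_len or lt < min_len:
        return False
    return max(ls, lt) / min(ls, lt) <= max_ratio
```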

Machine Translation Translation

The RWTH Aachen Machine Translation System for IWSLT 2016

no code implementations IWSLT 2016 Jan-Thorsten Peter, Andreas Guta, Nick Rossenbach, Miguel Graça, Hermann Ney

This work describes the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2016.

Machine Translation Translation

Analysis of Positional Encodings for Neural Machine Translation

no code implementations EMNLP (IWSLT) 2019 Jan Rosendahl, Viet Anh Khoa Tran, Weiyue Wang, Hermann Ney

In this work we analyze and compare the behavior of the Transformer architecture when using different positional encoding methods.

Machine Translation Sentence +1

Towards a Better Evaluation of Metrics for Machine Translation

no code implementations WMT (EMNLP) 2020 Peter Stanchev, Weiyue Wang, Hermann Ney

An important aspect of machine translation is its evaluation, which can be achieved through the use of a variety of metrics.

Machine Translation Translation
