no code implementations • 23 Feb 2024 • Jintao Jiang, Yingbo Gao, Mohammad Zeineldeen, Zoltan Tuske
In this paper, alternating weak triphone/BPE alignment supervision is proposed to improve end-to-end model training.
no code implementations • 15 Sep 2023 • Mohammad Zeineldeen, Albert Zeyer, Ralf Schlüter, Hermann Ney
We study a streamable attention-based encoder-decoder model in which either the decoder, or both the encoder and decoder, operate on pre-defined, fixed-size windows called chunks.
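The chunk restriction described above can be sketched as a chunk-local attention mask, where a position may only attend within its own fixed-size window. The function name and mask convention below are illustrative, not taken from the paper:

```python
def chunk_attention_mask(seq_len, chunk_size):
    """Boolean self-attention mask for chunk-wise (streamable) attention.

    Position i may attend to position j only if both frames fall into the
    same fixed-size chunk, so latency is bounded by the chunk length.
    """
    return [
        [(i // chunk_size) == (j // chunk_size) for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

For example, with `chunk_size=2` frames 0 and 1 see each other but frame 1 cannot attend to frame 2, which starts the next chunk.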
no code implementations • 8 Jun 2023 • Christian Herold, Yingbo Gao, Mohammad Zeineldeen, Hermann Ney
The integration of language models for neural machine translation has been extensively studied in the past.
1 code implementation • 6 Jun 2023 • Parnia Bahar, Mattia Di Gangi, Nick Rossenbach, Mohammad Zeineldeen
Automatic Arabic diacritization is useful in many applications, ranging from reading support for language learners to accurate pronunciation prediction for downstream tasks like speech synthesis.
no code implementations • 10 Mar 2023 • Mohammad Zeineldeen, Kartik Audhkhasi, Murali Karthick Baskar, Bhuvana Ramabhadran
Soft distillation is another popular KD method that distills the output logits of the teacher model.
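Soft distillation as described here trains the student on the teacher's temperature-softened output distribution. A minimal sketch, assuming the usual KL-divergence objective with the customary T² scaling (function names and the temperature value are illustrative):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The loss is scaled by T^2 so that gradient magnitudes stay comparable
    across temperatures, as is customary in knowledge distillation.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

Identical teacher and student logits give zero loss; any mismatch gives a positive loss that pulls the student's distribution toward the teacher's.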
no code implementations • 11 Jan 2023 • Christoph Lüscher, Jingjing Xu, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney
By further adding neural speaker embeddings, we gain additional ~3% relative WER improvement on Hub5'00.
no code implementations • 11 Nov 2022 • Wei Zhou, Haotian Wu, Jingjing Xu, Mohammad Zeineldeen, Christoph Lüscher, Ralf Schlüter, Hermann Ney
Detailed analysis and experimental verification are conducted to show the optimal positions in the ASR neural network (NN) to apply speaker enhancing and adversarial training.
no code implementations • 24 Oct 2022 • Christoph Lüscher, Mohammad Zeineldeen, Zijian Yang, Tina Raissi, Peter Vieting, Khai Le-Duc, Weiyue Wang, Ralf Schlüter, Hermann Ney
Language barriers present a great challenge in our increasingly connected and global world.
no code implementations • 26 Jun 2022 • Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Ralf Schlüter, Hermann Ney
In this work, we investigate various methods for speaker adaptive training (SAT) based on feature-space approaches for a conformer-based acoustic model (AM) on the Switchboard 300h dataset.
Automatic Speech Recognition (ASR) +1
no code implementations • 5 Nov 2021 • Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Wilfried Michel, Alexander Gerstenberger, Ralf Schlüter, Hermann Ney
The recently proposed conformer architecture has been successfully used for end-to-end automatic speech recognition (ASR) architectures achieving state-of-the-art performance on different datasets.
Automatic Speech Recognition (ASR) +1
no code implementations • 18 Oct 2021 • Felix Meyer, Wilfried Michel, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney
We show on the LibriSpeech (LBS) and Switchboard (SWB) corpora that the model scales for a combination of attention-based encoder-decoder acoustic model and language model can be learned as effectively as with manual tuning.
Automatic Speech Recognition (ASR) +2
no code implementations • 19 Apr 2021 • Wei Zhou, Mohammad Zeineldeen, Zuoyun Zheng, Ralf Schlüter, Hermann Ney
Subword units are commonly used for end-to-end automatic speech recognition (ASR), while a fully acoustic-oriented subword modeling approach is somewhat missing.
Automatic Speech Recognition (ASR) +3
no code implementations • 12 Apr 2021 • Mohammad Zeineldeen, Aleksandr Glushko, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney
Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions.
no code implementations • 12 Apr 2021 • Nick Rossenbach, Mohammad Zeineldeen, Benedikt Hilmes, Ralf Schlüter, Hermann Ney
We achieve a final word-error-rate of 3.3%/10.0% with a hybrid system on the clean/noisy test-sets, surpassing any previous state-of-the-art systems on Librispeech-100h that do not include unlabeled audio data.
Automatic Speech Recognition (ASR) +2
1 code implementation • 19 May 2020 • Mohammad Zeineldeen, Albert Zeyer, Wei Zhou, Thomas Ng, Ralf Schlüter, Hermann Ney
Following the rationale of end-to-end modeling, CTC, RNN-T or encoder-decoder-attention models for automatic speech recognition (ASR) use graphemes or grapheme-based subword units based on e.g. byte-pair encoding (BPE).
Automatic Speech Recognition (ASR) +1
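The grapheme-based subword units mentioned in the entry above are typically learned with byte-pair encoding, which repeatedly merges the most frequent adjacent symbol pair. A minimal sketch of the merge-learning step (function names and the toy vocabulary are illustrative, not from the paper):

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in a symbol tuple
    with the concatenated unit."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def learn_bpe_merges(vocab, num_merges):
    """Learn BPE merge operations from a {symbol-tuple: count} vocabulary.

    Each step counts adjacent symbol pairs weighted by word frequency and
    merges the most frequent pair into a single subword unit.
    """
    vocab = dict(vocab)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {merge_pair(w, best): f for w, f in vocab.items()}
    return merges
```

On a toy vocabulary such as `{("l","o","w"): 5, ("l","o","w","e","r"): 2, ("n","e","w","e","s","t"): 6}`, the first learned merge is `("w", "e")`, since that pair occurs 8 times in total.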