Search Results for author: Alexander Waibel

Found 38 papers, 5 papers with code

German-Arabic Speech-to-Speech Translation for Psychiatric Diagnosis

no code implementations COLING (WANLP) 2020 Juan Hussain, Mohammed Mediani, Moritz Behr, M. Amin Cheragui, Sebastian Stüker, Alexander Waibel

As this is a very specific domain, in addition to the linguistic challenges posed by translating between Arabic and German, we also focus in this paper on the methods we implemented for adapting our speech translation system to the domain of this psychiatric interview.

Automatic Speech Recognition (ASR) +6

Audio Segmentation for Robust Real-Time Speech Recognition Based on Neural Networks

no code implementations IWSLT 2016 Micha Wetzel, Matthias Sperber, Alexander Waibel

Speech that contains multimedia content can pose a serious challenge for real-time automatic speech recognition (ASR) for two reasons: (1) The ASR produces meaningless output, hurting the readability of the transcript.

Automatic Speech Recognition (ASR) +1

Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Text-based Translation

no code implementations NAACL (SIGTYP) 2021 Zhong Zhou, Alexander Waibel

In other words, given a text in 124 source languages, we translate it into a severely low resource language using only ∼1,000 lines of low resource data without any external help.

Effective combination of pretrained models - KIT@IWSLT2022

no code implementations IWSLT (ACL) 2022 Ngoc-Quan Pham, Tuan Nam Nguyen, Thai-Binh Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Alexander Waibel

Pretrained models in acoustic and textual modalities can potentially improve speech translation for both Cascade and End-to-end approaches.

Translation

Findings of the IWSLT 2021 Evaluation Campaign

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

This year's evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Multilingual Speech Translation KIT @ IWSLT2021

no code implementations ACL (IWSLT) 2021 Ngoc-Quan Pham, Tuan Nam Nguyen, Thanh-Le Ha, Sebastian Stüker, Alexander Waibel, Dan He

This paper contains the description for the submission of Karlsruhe Institute of Technology (KIT) for the multilingual TEDx translation task in the IWSLT 2021 evaluation campaign.

Translation

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

KIT’s Multilingual Neural Machine Translation systems for IWSLT 2017

no code implementations IWSLT 2017 Ngoc-Quan Pham, Matthias Sperber, Elizabeth Salesky, Thanh-Le Ha, Jan Niehues, Alexander Waibel

For the SLT track, in addition to a monolingual neural translation system used to generate correct punctuations and true cases of the data prior to training our multilingual system, we introduced a noise model in order to make our system more robust.

Machine Translation NMT +1

Convoifilter: A case study of doing cocktail party speech recognition

no code implementations 22 Aug 2023 Thai-Binh Nguyen, Alexander Waibel

The model utilizes a single-channel speech enhancement module that isolates the speaker's voice from background noise (ConVoiFilter) and an ASR module.

Automatic Speech Recognition (ASR) +2

Audio-driven Talking Face Generation by Overcoming Unintended Information Flow

no code implementations 18 Jul 2023 Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Hazim Kemal Ekenel, Alexander Waibel

Specifically, this involves unintended flow of lip, pose and other information from the reference to the generated image, as well as instabilities during model training.

Audio-Visual Synchronization Talking Face Generation

KIT's Multilingual Speech Translation System for IWSLT 2023

1 code implementation 8 Jun 2023 Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks.

Data Augmentation Retrieval +1

Towards continually learning new languages

no code implementations 21 Nov 2022 Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training.

Speech Recognition +1

Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

no code implementations 4 Oct 2022 Christian Huber, Enes Yavuz Ugan, Alexander Waibel

We propose a) a Language Agnostic end-to-end Speech Translation model (LAST), and b) a data augmentation strategy to increase code-switching (CS) performance.

Data Augmentation speech-recognition +2

Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos

no code implementations 9 Jun 2022 Alexander Waibel, Moritz Behr, Fevziye Irem Eyiokur, Dogucan Yaman, Tuan-Nam Nguyen, Carlos Mullov, Mehmet Arif Demirtas, Alperen Kantarcı, Stefan Constantin, Hazim Kemal Ekenel

The system combines multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains the emphases in speech, voice characteristics, and face video of the original speaker.

Automatic Speech Recognition (ASR) +6

Exposure Correction Model to Enhance Image Quality

1 code implementation 22 Apr 2022 Fevziye Irem Eyiokur, Dogucan Yaman, Hazim Kemal Ekenel, Alexander Waibel

We show that after applying exposure correction with the proposed model, the portrait matting quality increases significantly.

Image Matting Low-Light Image Enhancement

Short-Term Word-Learning in a Dynamically Changing Environment

no code implementations 29 Mar 2022 Christian Huber, Rishu Kumar, Ondřej Bojar, Alexander Waibel

In this paper we study a) methods to acquire important words for this memory dynamically and b) the trade-off between improvement in recognition accuracy of new words and the potential danger of false alarms for those added words.

Automatic Speech Recognition (ASR) +1

Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition

1 code implementation 5 Jul 2021 Christian Huber, Juan Hussain, Sebastian Stüker, Alexander Waibel

To alleviate this problem, we supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.

Automatic Speech Recognition (ASR) +2

Alpha Matte Generation from Single Input for Portrait Matting

no code implementations 6 Jun 2021 Dogucan Yaman, Hazim Kemal Ekenel, Alexander Waibel

We first generate a coarse segmentation map from the input image and then predict the alpha matte by utilizing the image and segmentation map.

Image Matting Segmentation

Efficient Weight factorization for Multilingual Speech Recognition

no code implementations 7 May 2021 Ngoc-Quan Pham, Tuan-Nam Nguyen, Sebastian Stueker, Alexander Waibel

The key idea of the method is to assign fast weight matrices to each language by decomposing each weight matrix into a shared component and a language-dependent component.

Speech Recognition

Unconstrained Face-Mask & Face-Hand Datasets: Building a Computer Vision System to Help Prevent the Transmission of COVID-19

2 code implementations 16 Mar 2021 Fevziye Irem Eyiokur, Hazim Kemal Ekenel, Alexander Waibel

To train and evaluate the developed system, we collected and annotated images that represent face mask usage and face-hand interaction in the real world.

Relative Positional Encoding for Speech Recognition and Direct Translation

no code implementations 20 May 2020 Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation.

Position Sentence +4

Very Deep Self-Attention Networks for End-to-End Speech Recognition

no code implementations 30 Apr 2019 Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel

Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community.

Speech Recognition

Effective Strategies in Zero-Shot Neural Machine Translation

1 code implementation IWSLT 2017 Thanh-Le Ha, Jan Niehues, Alexander Waibel

In this paper, we proposed two strategies which can be applied to a multilingual neural machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus.

Machine Translation Translation

Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder

no code implementations IWSLT 2016 Thanh-Le Ha, Jan Niehues, Alexander Waibel

In this paper, we present our first attempts in building a multilingual Neural Machine Translation framework under a unified approach.

Machine Translation NMT +1
