Search Results for author: Catalin Zorila

Found 10 papers, 1 paper with code

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

no code implementations • 24 Apr 2023 • Mohan Li, Rama Doddipatla, Catalin Zorila

In previous works, latency was optimised by truncating the online attention weights based on the hard alignments obtained from conventional ASR models, without taking into account the potential loss of ASR accuracy.

Automatic Speech Recognition (ASR) +1

Transformer-based Streaming ASR with Cumulative Attention

no code implementations • 11 Mar 2022 • Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla

In this paper, we propose an online attention mechanism, known as cumulative attention (CA), for streaming Transformer-based automatic speech recognition (ASR).

Automatic Speech Recognition (ASR) +1

Monaural source separation: From anechoic to reverberant environments

no code implementations • 15 Nov 2021 • Tobias Cord-Landwehr, Christoph Boeddeker, Thilo von Neumann, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach

Impressive progress in neural network-based single-channel speech source separation has been made in recent years.

Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation

no code implementations • 15 Jun 2021 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

The proposed method first uses mixtures of unseparated sources and the mixture invariant training (MixIT) criterion to train a teacher model.

Speech Separation
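The teacher model above is trained with the mixture invariant training (MixIT) criterion: the separator is fed a mixture of mixtures, and its estimated sources are assigned to the original mixtures in whichever binary grouping best reconstructs them. As a minimal illustration (not the authors' teacher-student pipeline), a brute-force MixIT loss over two mixtures can be sketched as follows; `mixit_loss`, the MSE reconstruction objective, and the toy signals are illustrative choices, not taken from the paper:

```python
import itertools
import numpy as np

def mixit_loss(est_sources, mixtures):
    """Brute-force MixIT loss sketch for two mixtures.

    est_sources: (M, T) separator outputs for the mixture-of-mixtures
    mixtures:    (2, T) the two original mixtures
    Each estimated source is assigned to exactly one mixture; the loss
    is the minimum MSE over all 2^M such binary assignments.
    """
    M, _ = est_sources.shape
    best = np.inf
    for assign in itertools.product([0, 1], repeat=M):
        A = np.zeros((2, M))
        A[list(assign), range(M)] = 1.0   # one-hot assignment columns
        remix = A @ est_sources          # (2, T) re-mixed estimates
        best = min(best, np.mean((remix - mixtures) ** 2))
    return best

# Toy check: if the estimates equal the true mixture components,
# one assignment reconstructs both mixtures exactly and the loss is 0.
s = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])  # 3 "sources"
mixtures = np.array([s[0] + s[2], s[1]])            # two mixtures
print(mixit_loss(s, mixtures))  # 0.0
```

In practice the number of estimated sources M is small, so the exponential search over assignments is affordable; real implementations also use a scale-invariant SNR objective rather than plain MSE.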

Head-synchronous Decoding for Transformer-based Streaming ASR

no code implementations • 26 Apr 2021 • Mohan Li, Catalin Zorila, Rama Doddipatla

Online Transformer-based automatic speech recognition (ASR) systems have been extensively studied due to the increasing demand for streaming applications.

Automatic Speech Recognition (ASR) +1

Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism

no code implementations • 7 Feb 2021 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

In this paper, we present a novel multi-channel speech extraction system to simultaneously extract multiple clean individual sources from a mixture in noisy and reverberant environments.

Speech Extraction, Speech Recognition +1

On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments

no code implementations • 11 Nov 2020 • Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker

To reduce the influence of reverberation on spatial feature extraction, a dereverberation pre-processing method has been applied to further improve the separation performance.

Speech Recognition +1

An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription

1 code implementation • 26 Sep 2019 • Catalin Zorila, Christoph Boeddeker, Rama Doddipatla, Reinhold Haeb-Umbach

Despite the strong modeling power of neural network acoustic models, speech enhancement has been shown to deliver additional word error rate improvements if multi-channel data is available.

Speech Enhancement
