Search Results for author: Puming Zhan

Found 7 papers, 0 papers with code

Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems

no code implementations • 29 Jun 2022 • Jesús Andrés-Ferrer, Dario Albesano, Puming Zhan, Paul Vozila

In this work, we propose a contextual density ratio approach for both training a context-aware E2E model and adapting the language model to named entities.

Language Modelling
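For intuition, here is a minimal decoding-time sketch of the density-ratio idea, assuming per-token log-probabilities from the E2E model, a target (entity-biased) LM, and a source-domain LM; the interpolation weights and the restriction to entity spans are illustrative, not the paper's exact recipe:

```python
# Minimal sketch of density-ratio LM biasing at decoding time. The lambda
# weights would be tuned on development data; in the contextual variant,
# the ratio correction is applied only on named-entity spans.

def rescore_token(log_p_e2e: float,
                  log_p_target_lm: float,
                  log_p_source_lm: float,
                  in_entity_span: bool,
                  lam_tgt: float = 0.3,
                  lam_src: float = 0.3) -> float:
    """Combine E2E and LM token scores with the density-ratio correction."""
    score = log_p_e2e
    if in_entity_span:  # bias only where the contextual LM is relevant
        score += lam_tgt * log_p_target_lm - lam_src * log_p_source_lm
    return score
```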

On the Prediction Network Architecture in RNN-T for ASR

no code implementations • 29 Jun 2022 • Dario Albesano, Jesús Andrés-Ferrer, Nicola Ferri, Puming Zhan

In contrast to some previous works, our results show that the Transformer does not always outperform the LSTM when used as the prediction network alongside a Conformer encoder.
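As a rough illustration of the two prediction-network choices being compared, here is a hypothetical PyTorch sketch; the dimensions, layer counts, and masking convention are assumptions, not the paper's configuration:

```python
# Two swappable RNN-T prediction networks over previous label tokens.
import torch
import torch.nn as nn

class LSTMPredictor(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 512, layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, num_layers=layers, batch_first=True)

    def forward(self, labels: torch.Tensor) -> torch.Tensor:
        return self.rnn(self.embed(labels))[0]

class TransformerPredictor(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 512, layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, labels: torch.Tensor) -> torch.Tensor:
        x = self.embed(labels)
        # Causal mask so each position attends only to previous labels,
        # matching the autoregressive role of the prediction network.
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        return self.enc(x, mask=mask)
```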

Conformer with dual-mode chunked attention for joint online and offline ASR

no code implementations • 22 Jun 2022 • Felix Weninger, Marco Gaudesi, Md Akmal Haidar, Nicola Ferri, Jesús Andrés-Ferrer, Puming Zhan

In the dual-mode Conformer Transducer model, layers can operate in online or offline mode while sharing parameters, and in-place knowledge distillation from the offline to the online mode is applied during training to improve online accuracy.

Knowledge Distillation
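A minimal sketch of the chunked-attention masking that enables the two modes, assuming a boolean mask convention (True = may attend) and an illustrative chunk size; the in-place distillation mentioned above would additionally add a loss term between the two modes' outputs during training:

```python
# Build a self-attention mask for dual-mode (online/offline) operation.
import torch

def attention_mask(seq_len: int, online: bool, chunk_size: int = 16) -> torch.Tensor:
    """Returns a (seq_len, seq_len) boolean mask; True = position may be attended."""
    if not online:  # offline mode: full self-attention over the utterance
        return torch.ones(seq_len, seq_len, dtype=torch.bool)
    chunk = torch.arange(seq_len) // chunk_size
    # Online mode: a query attends to its own chunk and all previous chunks,
    # so lookahead is bounded by the chunk size and latency stays limited.
    return chunk.unsqueeze(1) >= chunk.unsqueeze(0)
```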

ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization

no code implementations • 23 Sep 2021 • Marco Gaudesi, Felix Weninger, Dushyant Sharma, Puming Zhan

End-to-end (E2E) multi-channel ASR systems achieve state-of-the-art performance on far-field ASR tasks by jointly training a multi-channel front-end together with the ASR model.

Data Augmentation
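The channel-randomization idea can be sketched in a few lines; the function name and the keep-at-least-two policy below are assumptions for illustration, not the paper's exact augmentation schedule:

```python
# Randomly drop and permute microphone channels during training so the
# multi-channel front-end does not overfit one array geometry.
import torch

def channel_augment(x: torch.Tensor, keep_min: int = 2) -> torch.Tensor:
    """x: (channels, time) multi-channel waveform; assumes channels >= keep_min."""
    n = x.size(0)
    keep = torch.randint(keep_min, n + 1, (1,)).item()  # how many channels to keep
    perm = torch.randperm(n)[:keep]                     # random subset in random order
    return x[perm]
```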

Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition

no code implementations • 17 Sep 2021 • Felix Weninger, Marco Gaudesi, Ralf Leibold, Roberto Gemello, Puming Zhan

We use a single-channel encoder for CT speech and a multi-channel encoder with Spatial Filtering neural beamforming for FT speech, which are jointly trained with the encoder selection.

Speech Recognition
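A hypothetical sketch of the dual-encoder routing, assuming time-aligned encoder outputs and a learned per-frame soft selection between the close-talk (CT) and far-talk (FT) branches; the paper's actual selection mechanism may differ:

```python
# Soft selection between a CT encoder and an FT encoder per output frame.
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    def __init__(self, ct_encoder: nn.Module, ft_encoder: nn.Module, dim: int):
        super().__init__()
        self.ct, self.ft = ct_encoder, ft_encoder
        self.select = nn.Linear(dim, 1)  # per-frame selection logit (assumed)

    def forward(self, ct_input, ft_input):
        # Assumes both encoders emit (batch, time, dim) with aligned frames.
        h_ct, h_ft = self.ct(ct_input), self.ft(ft_input)
        alpha = torch.sigmoid(self.select(h_ct + h_ft))  # soft encoder selection
        return alpha * h_ct + (1 - alpha) * h_ft
```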

Semi-Supervised Learning with Data Augmentation for End-to-End ASR

no code implementations • 27 Jul 2020 • Felix Weninger, Franco Mana, Roberto Gemello, Jesús Andrés-Ferrer, Puming Zhan

As a result, the Noisy Student algorithm with soft labels and consistency regularization achieves a 10.4% word error rate (WER) reduction when adding 475h of unlabeled data, corresponding to a recovery rate of 92%.

Data Augmentation, Image Classification, +1
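The recovery-rate figure can be made concrete with a small sketch: relative WER reduction from semi-supervised training, divided by the reduction an oracle fully supervised model achieves on the same added data. The baseline and oracle WER numbers below are illustrative, chosen only to reproduce the 10.4% reduction and 92% recovery cited in the snippet:

```python
# Recovery rate: how much of the supervised gain semi-supervision recovers.
def relative_werr(baseline: float, adapted: float) -> float:
    return (baseline - adapted) / baseline

def recovery_rate(baseline: float, semi: float, supervised: float) -> float:
    return relative_werr(baseline, semi) / relative_werr(baseline, supervised)

# Illustrative numbers only (not from the paper):
print(recovery_rate(baseline=10.0, semi=8.96, supervised=8.87))  # ~0.92
```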

Listen, Attend, Spell and Adapt: Speaker Adapted Sequence-to-Sequence ASR

no code implementations • 8 Jul 2019 • Felix Weninger, Jesús Andrés-Ferrer, Xinwei Li, Puming Zhan

Sequence-to-sequence (seq2seq) based ASR systems have shown state-of-the-art performance while having clear advantages in terms of simplicity.

Language Modelling
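A hedged sketch of the speaker-adaptation recipe implied by the title: fine-tune the seq2seq model on one speaker's data while penalizing drift from the speaker-independent (SI) weights. The L2 regularizer, learning rate, and step budget are assumptions, not the paper's method:

```python
# Fine-tune a seq2seq model on a single speaker's (x, y) batches with an
# L2 pull toward the SI weights to avoid overfitting scarce adaptation data.
import itertools
import torch

def adapt(model, speaker_batches, loss_fn, steps: int = 100, rho: float = 1e-3):
    si_params = [p.detach().clone() for p in model.parameters()]  # SI snapshot
    opt = torch.optim.SGD(model.parameters(), lr=1e-4)
    for x, y in itertools.islice(speaker_batches, steps):
        loss = loss_fn(model(x), y)
        loss = loss + rho * sum(((p - q) ** 2).sum()
                                for p, q in zip(model.parameters(), si_params))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```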
