no code implementations • EMNLP 2021 • Yimeng Wu, Mehdi Rezagholizadeh, Abbas Ghaddar, Md Akmal Haidar, Ali Ghodsi
Intermediate layer matching has been shown to be an effective approach for improving knowledge distillation (KD).
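A minimal sketch of intermediate layer matching for KD (assuming PyTorch, a fixed one-to-one teacher-to-student layer mapping, matching hidden dimensions, and an MSE matching loss; these specifics are illustrative assumptions, not the paper's exact setup):

```python
import torch
import torch.nn.functional as F

def intermediate_matching_loss(teacher_hidden, student_hidden, layer_map):
    """MSE between selected teacher and student hidden states.

    teacher_hidden / student_hidden: lists of [batch, seq, dim] tensors.
    layer_map: hypothetical fixed list of (teacher_idx, student_idx) pairs.
    Assumes teacher and student share the hidden dimension; in practice a
    linear projection on the student side may be needed.
    """
    loss = 0.0
    for t_idx, s_idx in layer_map:
        loss = loss + F.mse_loss(student_hidden[s_idx], teacher_hidden[t_idx].detach())
    return loss / len(layer_map)

# Example: map every other layer of a 12-layer teacher to a 6-layer student.
layer_map = [(1, 0), (3, 1), (5, 2), (7, 3), (9, 4), (11, 5)]
```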
no code implementations • 22 Jun 2022 • Felix Weninger, Marco Gaudesi, Md Akmal Haidar, Nicola Ferri, Jesús Andrés-Ferrer, Puming Zhan
In the dual-mode Conformer Transducer model, layers can function in online or offline mode while sharing parameters, and in-place knowledge distillation from offline to online mode is applied in training to improve online accuracy.
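A rough sketch of the in-place distillation idea (assuming PyTorch, a hypothetical `model(batch, mode=...)` interface over the shared parameters, and a temperature-scaled KL term; the details are illustrative, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def dual_mode_step(model, batch, alpha=0.5, tau=1.0):
    """One training step for a dual-mode (online/offline) model with in-place KD.

    The two modes share the same parameters; only the mode flag differs.
    """
    offline_logits = model(batch, mode="offline")
    online_logits = model(batch, mode="online")

    # Task losses on ground truth for both modes (placeholder for the transducer loss).
    task_loss = model.loss(offline_logits, batch) + model.loss(online_logits, batch)

    # In-place KD: the online mode mimics the (stronger) offline mode's distribution.
    kd_loss = F.kl_div(
        F.log_softmax(online_logits / tau, dim=-1),
        F.softmax(offline_logits.detach() / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2

    return task_loss + alpha * kd_loss
```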
no code implementations • COLING 2022 • Md Akmal Haidar, Mehdi Rezagholizadeh, Abbas Ghaddar, Khalil Bibi, Philippe Langlais, Pascal Poupart
Knowledge distillation (KD) is an efficient framework for compressing large-scale pre-trained language models.
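For context, the generic logit-level KD objective that such compression methods build on (a textbook formulation with temperature-scaled soft targets, not this paper's specific method):

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, tau=2.0, alpha=0.5):
    """Standard KD: cross-entropy on labels plus KL to the teacher's soft targets."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2
    return (1 - alpha) * ce + alpha * kl
```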
no code implementations • Findings (NAACL) 2022 • Md Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart
To address these problems, we propose a RAndom Intermediate Layer Knowledge Distillation (RAIL-KD) approach in which intermediate layers of the teacher model are randomly selected for distillation into the intermediate layers of the student model.
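A minimal sketch of the random-layer-selection idea (assuming PyTorch; the per-step uniform sampling, ordering, and MSE matching loss are illustrative assumptions rather than the exact RAIL-KD configuration):

```python
import random
import torch.nn.functional as F

def rail_kd_loss(teacher_hidden, student_hidden):
    """Randomly pick k teacher layers (k = number of student layers) each step
    and match them, in depth order, to the student's intermediate layers."""
    k = len(student_hidden)
    picked = sorted(random.sample(range(len(teacher_hidden)), k))
    loss = 0.0
    for s_idx, t_idx in enumerate(picked):
        loss = loss + F.mse_loss(student_hidden[s_idx], teacher_hidden[t_idx].detach())
    return loss / k
```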
no code implementations • 17 Mar 2021 • Md Akmal Haidar, Chao Xing, Mehdi Rezagholizadeh
End-to-end automatic speech recognition (ASR), unlike conventional ASR, does not have dedicated modules to learn semantic representations from the speech encoder.
Ranked #12 on Speech Recognition on LibriSpeech test-clean
no code implementations • 10 Mar 2021 • Md Akmal Haidar, Mehdi Rezagholizadeh
In this paper, we introduce a novel framework for fine-tuning a pre-trained ASR model using the GAN objective where the ASR model acts as a generator and a discriminator tries to distinguish the ASR output from the real data.
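A toy sketch of the adversarial fine-tuning objective described above (assuming PyTorch; the discriminator interface, the embedding representation of ASR outputs and real transcriptions, and the loss weighting are all placeholder assumptions):

```python
import torch
import torch.nn.functional as F

def gan_finetune_step(asr_model, discriminator, speech, real_text_feats, lam=0.1):
    """ASR model as generator: keep its usual ASR loss while making its output
    representations indistinguishable (to the discriminator) from real-text ones."""
    asr_out = asr_model(speech)  # hypothetical output with .loss and .embeddings

    # Discriminator step: real transcription features -> 1, ASR outputs -> 0.
    fake_detached = discriminator(asr_out.embeddings.detach())
    real_score = discriminator(real_text_feats)
    d_loss = (F.binary_cross_entropy_with_logits(real_score, torch.ones_like(real_score))
              + F.binary_cross_entropy_with_logits(fake_detached, torch.zeros_like(fake_detached)))

    # Generator (ASR) step: standard ASR loss plus the term that fools the discriminator.
    fake_score = discriminator(asr_out.embeddings)
    g_loss = asr_out.loss + lam * F.binary_cross_entropy_with_logits(
        fake_score, torch.ones_like(fake_score))
    return g_loss, d_loss
```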
no code implementations • 25 Sep 2019 • Vasileios Lioutas, Ahmad Rashid, Krtin Kumar, Md Akmal Haidar, Mehdi Rezagholizadeh
Word embeddings are a vital component of Natural Language Processing (NLP) systems and have been extensively researched.
no code implementations • 13 Nov 2018 • Mehdi Rezagholizadeh, Md Akmal Haidar
We performed several experiments on a publicly available driving dataset to evaluate our proposed method, and the results are very promising.