no code implementations • EMNLP 2021 • Yimeng Wu, Mehdi Rezagholizadeh, Abbas Ghaddar, Md Akmal Haidar, Ali Ghodsi
Intermediate layer matching has been shown to be an effective approach for improving knowledge distillation (KD).
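To make the idea concrete, here is a minimal PyTorch sketch of intermediate-layer matching; the layer pairing, the linear projection, and the MSE distance below are illustrative assumptions, not the exact recipe of the paper above.

```python
import torch
import torch.nn as nn

def layer_matching_loss(student_hiddens, teacher_hiddens, proj, layer_map):
    """MSE between projected student layers and their mapped teacher layers.

    student_hiddens / teacher_hiddens: lists of [batch, seq, dim] tensors.
    proj: nn.Linear mapping student dim -> teacher dim (an assumption here).
    layer_map: {student_layer_idx: teacher_layer_idx} -- the pairing
    strategy is the crux of intermediate-layer matching and varies by method.
    """
    loss = 0.0
    for s_idx, t_idx in layer_map.items():
        loss = loss + nn.functional.mse_loss(
            proj(student_hiddens[s_idx]), teacher_hiddens[t_idx]
        )
    return loss / len(layer_map)

# Toy usage: a 4-layer student matched against a 12-layer teacher.
student_h = [torch.randn(2, 8, 312) for _ in range(4)]
teacher_h = [torch.randn(2, 8, 768) for _ in range(12)]
proj = nn.Linear(312, 768)
print(layer_matching_loss(student_h, teacher_h, proj, {0: 2, 1: 5, 2: 8, 3: 11}))
```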
no code implementations • Findings (EMNLP) 2021 • Peng Lu, Abbas Ghaddar, Ahmad Rashid, Mehdi Rezagholizadeh, Ali Ghodsi, Philippe Langlais
Knowledge Distillation (KD) is extensively used in Natural Language Processing to compress the pre-training and task-specific fine-tuning phases of large neural language models.
1 code implementation • 15 Jan 2024 • Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Boxing Chen
Pretraining monolingual language models has proven to be vital for performance in Arabic Natural Language Processing (NLP) tasks.
no code implementations • 11 Jun 2023 • Asaad Alghamdi, Xinyu Duan, Wei Jiang, Zhenhai Wang, Yimeng Wu, Qingrong Xia, Zhefeng Wang, Yi Zheng, Mehdi Rezagholizadeh, Baoxing Huai, Peilun Cheng, Abbas Ghaddar
Developing monolingual large Pre-trained Language Models (PLMs) has been shown to be very successful in handling different tasks in Natural Language Processing (NLP).
no code implementations • 21 May 2022 • Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
There is a growing body of work in recent years to develop pre-trained language models (PLMs) for the Arabic language.
no code implementations • COLING 2022 • Md Akmal Haidar, Mehdi Rezagholizadeh, Abbas Ghaddar, Khalil Bibi, Philippe Langlais, Pascal Poupart
Knowledge distillation (KD) is an efficient framework for compressing large-scale pre-trained language models.
1 code implementation • 8 Dec 2021 • Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting; Arabic is no exception.
no code implementations • 9 Nov 2021 • David Alfonso-Hermelo, Ahmad Rashid, Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh
We apply NATURE to common slot-filling and intent-detection benchmarks and demonstrate that simple NATURE perturbations of the standard evaluation set can significantly degrade model performance.
no code implementations • Findings (NAACL) 2022 • Md Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart
To address these problems, we propose a RAndom Intermediate Layer Knowledge Distillation (RAIL-KD) approach, in which intermediate layers of the teacher model are randomly selected to be distilled into the intermediate layers of the student model.
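A minimal sketch of the random-selection idea, assuming PyTorch; the sampling granularity, equal hidden sizes, and the mean-pooled normalized MSE distance are assumptions for illustration, not the paper's exact formulation.

```python
import random
import torch
import torch.nn.functional as F

def rail_kd_loss(student_hiddens, teacher_hiddens):
    """Randomly pick as many teacher layers as the student has, sort them
    to preserve depth order, and match them layer-wise.

    Distance: MSE on L2-normalized, mean-pooled hidden states (an
    illustrative choice). Hidden sizes are assumed equal for simplicity.
    """
    k = len(student_hiddens)
    picked = sorted(random.sample(range(len(teacher_hiddens)), k))
    loss = 0.0
    for s_h, t_idx in zip(student_hiddens, picked):
        s = F.normalize(s_h.mean(dim=1), dim=-1)                    # [batch, dim]
        t = F.normalize(teacher_hiddens[t_idx].mean(dim=1), dim=-1)  # [batch, dim]
        loss = loss + F.mse_loss(s, t)
    return loss / k

# Toy usage: 4-layer student, 12-layer teacher; a fresh random subset of
# teacher layers is drawn on every call.
student_h = [torch.randn(2, 8, 768) for _ in range(4)]
teacher_h = [torch.randn(2, 8, 768) for _ in range(12)]
print(rail_kd_loss(student_h, teacher_h))
```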
no code implementations • WNUT (ACL) 2021 • Shivendra Bhardwaj, Abbas Ghaddar, Ahmad Rashid, Khalil Bibi, Chengyang Li, Ali Ghodsi, Philippe Langlais, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is extensively used to compress and deploy large pre-trained language models on edge devices for real-world applications.
no code implementations • Findings (ACL) 2021 • Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Ahmad Rashid
Existing Natural Language Understanding (NLU) models have been shown to incorporate dataset biases, leading to strong performance on in-distribution (ID) test sets but poor performance on out-of-distribution (OOD) ones.
no code implementations • 24 Jul 2021 • Abbas Ghaddar, Philippe Langlais, Ahmad Rashid, Mehdi Rezagholizadeh
In this work, we examine the ability of NER models to use contextual information when predicting the type of an ambiguous entity.
no code implementations • EMNLP 2021 • Ahmad Rashid, Vasileios Lioutas, Abbas Ghaddar, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is a common knowledge transfer algorithm used for model compression across a variety of deep learning-based natural language processing (NLP) solutions.
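For reference, the standard soft-target KD loss that such methods build on, as a minimal PyTorch sketch; the temperature and mixing weight are illustrative defaults, not values from this paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend temperature-softened KL against the teacher (scaled by T^2)
    with ordinary cross-entropy on the gold labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage on random logits for a 3-class task.
s = torch.randn(4, 3)
t = torch.randn(4, 3)
y = torch.randint(0, 3, (4,))
print(kd_loss(s, t, y))
```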
1 code implementation • LREC 2020 • Abbas Ghaddar, Philippe Langlais
This paper describes the acquisition, preprocessing and characteristics of SEDAR, a large-scale English-French parallel corpus for the financial domain.
no code implementations • WS 2019 • Abbas Ghaddar, Philippe Langlais
We describe a special type of deep contextualized word representation that is learned from distant supervision annotations and dedicated to named entity recognition.
1 code implementation • COLING 2018 • Abbas Ghaddar, Philippe Langlais
While some features do remain in state-of-the-art systems, lexical features have been mostly discarded, with the exception of gazetteers.
Ranked #22 on Named Entity Recognition (NER) on OntoNotes v5 (English), using extra training data
no code implementations • IJCNLP 2017 • Abbas Ghaddar, Philippe Langlais
We revisit the idea of mining Wikipedia in order to generate named-entity annotations.
no code implementations • LREC 2016 • Abbas Ghaddar, Philippe Langlais
This paper presents WikiCoref, an English corpus annotated for anaphoric relations, where all documents are from the English version of Wikipedia.