Search Results for author: Abbas Ghaddar

Found 25 papers, 7 papers with code

RW-KD: Sample-wise Loss Terms Re-Weighting for Knowledge Distillation

no code implementations · Findings (EMNLP) 2021 · Peng Lu, Abbas Ghaddar, Ahmad Rashid, Mehdi Rezagholizadeh, Ali Ghodsi, Philippe Langlais

Knowledge Distillation (KD) is extensively used in Natural Language Processing to compress the pre-training and task-specific fine-tuning phases of large neural language models.
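As a concrete illustration of the distillation objective referred to above, here is a minimal, framework-free sketch of the standard temperature-scaled KD loss (Hinton-style KL divergence between softened teacher and student distributions). This is generic KD, not the sample-wise loss-term re-weighting RW-KD itself proposes, and all function names are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the classic distillation objective."""
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # soft student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In practice this term is combined with the ordinary cross-entropy on gold labels; RW-KD's contribution is learning per-sample weights for those loss terms rather than fixing them globally.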

Knowledge Distillation

Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data

1 code implementation · 19 Sep 2024 · Jiaming Zhou, Abbas Ghaddar, Ge Zhang, Liheng Ma, Yaochen Hu, Soumyasundar Pal, Mark Coates, Bin Wang, Yingxue Zhang, Jianye Hao

Despite recent advances in training and prompting strategies for Large Language Models (LLMs), these models continue to face challenges with complex logical reasoning tasks that involve long reasoning chains.

Logical Reasoning · Spatial Reasoning

CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search

1 code implementation · 7 Jun 2024 · Fengran Mo, Abbas Ghaddar, Kelong Mao, Mehdi Rezagholizadeh, Boxing Chen, Qun Liu, Jian-Yun Nie

In this paper, we study how open-source large language models (LLMs) can be effectively deployed for improving query rewriting in conversational search, especially for ambiguous queries.

Conversational Search

OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection

1 code implementation · 4 Jun 2024 · Chenyang Huang, Abbas Ghaddar, Ivan Kobyzev, Mehdi Rezagholizadeh, Osmar R. Zaiane, Boxing Chen

In this work, we introduce OTTAWA, a novel Optimal Transport (OT)-based word aligner specifically designed to enhance the detection of hallucinations and omissions in MT systems.
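OTTAWA's machinery is Optimal Transport over word pairs. As a rough, hypothetical sketch of that underlying idea (not OTTAWA's actual cost construction or null-alignment handling), the following computes an entropic-regularized transport plan with plain Sinkhorn iterations; the plan's entries act as soft word alignments, and mass that lands far from any good match is the kind of signal an aligner can use to flag hallucinated or omitted words.

```python
import math

def sinkhorn(cost, reg=0.1, iters=200):
    """Entropic-regularized OT between uniform marginals over source
    and target words. `cost[i][j]` is the mismatch between source word
    i and target word j; returns the transport plan (soft alignments)."""
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: small cost -> large affinity
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # alternately rescale so rows sum to 1/n and columns to 1/m
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

On a toy 2x2 cost matrix where word 0 matches word 0 and word 1 matches word 1, nearly all mass concentrates on the diagonal of the returned plan.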

Hallucination · Machine Translation +2

On the importance of Data Scale in Pretraining Arabic Language Models

1 code implementation · 15 Jan 2024 · Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Boxing Chen

Pretraining monolingual language models has been proven to be vital for performance in Arabic Natural Language Processing (NLP) tasks.

Decoder · Language Modeling +1

AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing

no code implementations · 11 Jun 2023 · Asaad Alghamdi, Xinyu Duan, Wei Jiang, Zhenhai Wang, Yimeng Wu, Qingrong Xia, Zhefeng Wang, Yi Zheng, Mehdi Rezagholizadeh, Baoxing Huai, Peilun Cheng, Abbas Ghaddar

Developing monolingual large Pre-trained Language Models (PLMs) has been shown to be very successful in handling different tasks in Natural Language Processing (NLP).

Few-Shot Learning

JABER and SABER: Junior and Senior Arabic BERt

1 code implementation · 8 Dec 2021 · Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais

Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting, and Arabic is no exception.

Language Modeling · Language Modelling +1

NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation

no code implementations · 9 Nov 2021 · David Alfonso-Hermelo, Ahmad Rashid, Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh

We apply NATURE to common slot-filling and intent detection benchmarks and demonstrate that simple perturbations from the standard evaluation set by NATURE can deteriorate model performance significantly.

Intent Detection · slot-filling +1

RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation

no code implementations · Findings (NAACL) 2022 · Md Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart

To address these problems, we propose a RAndom Intermediate Layer Knowledge Distillation (RAIL-KD) approach in which intermediate layers from the teacher model are selected randomly to be distilled into the intermediate layers of the student model.
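The random layer-selection step described above can be sketched as follows. The function name, uniform sampling without replacement, and the depth-ordered mapping are assumptions for illustration; the actual RAIL-KD training loop, loss terms, and re-sampling schedule are in the paper.

```python
import random

def sample_layer_mapping(num_teacher_layers, num_student_layers, rng=None):
    """For the current epoch, pick which teacher intermediate layers each
    student intermediate layer should mimic: draw teacher layer indices
    uniformly without replacement, then sort so the mapping preserves
    depth order (shallow student layers mimic shallower teacher layers)."""
    rng = rng or random
    chosen = sorted(rng.sample(range(num_teacher_layers), num_student_layers))
    # mapping[i] -> index of the teacher layer distilled into student layer i
    return {i: t for i, t in enumerate(chosen)}
```

For example, with a 12-layer teacher and a 4-layer student, each epoch yields a fresh sorted 4-subset of teacher layers, so the student is regularized against varying intermediate representations rather than one fixed layer mapping.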

Knowledge Distillation

End-to-End Self-Debiasing Framework for Robust NLU Training

no code implementations · Findings (ACL) 2021 · Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Ahmad Rashid

Existing Natural Language Understanding (NLU) models have been shown to incorporate dataset biases leading to strong performance on in-distribution (ID) test sets but poor performance on out-of-distribution (OOD) ones.

Natural Language Understanding

Towards Zero-Shot Knowledge Distillation for Natural Language Processing

no code implementations · EMNLP 2021 · Ahmad Rashid, Vasileios Lioutas, Abbas Ghaddar, Mehdi Rezagholizadeh

Knowledge Distillation (KD) is a common knowledge transfer algorithm used for model compression across a variety of deep learning based natural language processing (NLP) solutions.

Knowledge Distillation · Model Compression +1

SEDAR: a Large Scale French-English Financial Domain Parallel Corpus

1 code implementation · LREC 2020 · Abbas Ghaddar, Philippe Langlais

This paper describes the acquisition, preprocessing and characteristics of SEDAR, a large scale English-French parallel corpus for the financial domain.

Domain Adaptation · Machine Translation +2

Contextualized Word Representations from Distant Supervision with and for NER

no code implementations · WS 2019 · Abbas Ghaddar, Philippe Langlais

We describe a special type of deep contextualized word representation that is learned from distant supervision annotations and dedicated to named entity recognition.

named-entity-recognition · Named Entity Recognition +1

Robust Lexical Features for Improved Neural Network Named-Entity Recognition

1 code implementation · COLING 2018 · Abbas Ghaddar, Philippe Langlais

While some features do remain in state-of-the-art systems, lexical features have been mostly discarded, with the exception of gazetteers.

Ranked #23 on Named Entity Recognition (NER) on Ontonotes v5 (English) (using extra training data)

named-entity-recognition · Named Entity Recognition +1

WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles

no code implementations · LREC 2016 · Abbas Ghaddar, Philippe Langlais

This paper presents WikiCoref, an English corpus annotated for anaphoric relations, where all documents are from the English version of Wikipedia.

Articles · coreference-resolution
