Search Results for author: Philippe Langlais

Found 32 papers, 5 papers with code

Unsupervised multiple-choice question generation for out-of-domain Q&A fine-tuning

no code implementations ACL 2022 Guillaume Le Berre, Christophe Cerisara, Philippe Langlais, Guy Lapalme

Pre-trained models have shown very good performance on a number of question answering benchmarks, especially when fine-tuned on multiple question answering datasets at once.

Multiple-choice Question Answering +2

RW-KD: Sample-wise Loss Terms Re-Weighting for Knowledge Distillation

no code implementations Findings (EMNLP) 2021 Peng Lu, Abbas Ghaddar, Ahmad Rashid, Mehdi Rezagholizadeh, Ali Ghodsi, Philippe Langlais

Knowledge Distillation (KD) is extensively used in Natural Language Processing to compress the pre-training and task-specific fine-tuning phases of large neural language models.

Knowledge Distillation

Refining an Almost Clean Translation Memory Helps Machine Translation

no code implementations AMTA 2022 Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard

While recent studies have been dedicated to cleaning very noisy parallel corpora to improve Machine Translation training, we focus in this work on filtering a large and mostly clean Translation Memory.

Machine Translation Translation

Exploiting Domain-Specific Knowledge for Judgment Prediction Is No Panacea

no code implementations RANLP 2021 Olivier Salaün, Philippe Langlais, Karim Benyekhlef

Legal judgment prediction (LJP) usually consists in a text classification task aimed at predicting the verdict on the basis of the fact description.

Legal Reasoning text-classification +1

EUROPA: A Legal Multilingual Keyphrase Generation Dataset

no code implementations 1 Mar 2024 Olivier Salaün, Frédéric Piedboeuf, Guillaume Le Berre, David Alfonso Hermelo, Philippe Langlais

Keyphrase generation has primarily been explored within the context of academic research articles, with a particular focus on scientific domains and the English language.

Keyphrase Generation

Data Augmentation is Dead, Long Live Data Augmentation

no code implementations 22 Feb 2024 Frédéric Piedboeuf, Philippe Langlais

Textual data augmentation (DA) is a prolific field of study where novel techniques to create artificial data are regularly proposed, and that has demonstrated great efficiency on small data settings, at least for text classification tasks.

Data Augmentation text-classification +1

On the importance of Data Scale in Pretraining Arabic Language Models

1 code implementation 15 Jan 2024 Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Boxing Chen

Pretraining monolingual language models has been proven to be vital for performance in Arabic Natural Language Processing (NLP) tasks.

Language Modelling

LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization

no code implementations 8 May 2023 Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais

Label Smoothing (LS) is another simple, versatile and efficient regularization which can be applied to various supervised classification tasks.

Image Classification Machine Translation

JABER and SABER: Junior and Senior Arabic BERt

1 code implementation 8 Dec 2021 Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais

Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting; Arabic is no exception.

Language Modelling NER

NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation

no code implementations 9 Nov 2021 David Alfonso-Hermelo, Ahmad Rashid, Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh

We apply NATURE to common slot-filling and intent detection benchmarks and demonstrate that simple perturbations from the standard evaluation set by NATURE can deteriorate model performance significantly.

Intent Detection slot-filling +1

Pseudo Knowledge Distillation: Towards Learning Optimal Instance-specific Label Smoothing Regularization

no code implementations 29 Sep 2021 Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais

Knowledge Distillation (KD) is an algorithm that transfers the knowledge of a trained, typically larger, neural network into another model under training.

Image Classification Knowledge Distillation +1

RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation

no code implementations Findings (NAACL) 2022 Md Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart

To address these problems, we propose a RAndom Intermediate Layer Knowledge Distillation (RAIL-KD) approach in which intermediate layers from the teacher model are selected randomly to be distilled into the intermediate layers of the student model.

Knowledge Distillation

End-to-End Self-Debiasing Framework for Robust NLU Training

no code implementations Findings (ACL) 2021 Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Ahmad Rashid

Existing Natural Language Understanding (NLU) models have been shown to incorporate dataset biases leading to strong performance on in-distribution (ID) test sets but poor performance on out-of-distribution (OOD) ones.

Natural Language Understanding

Robust Lexical Features for Improved Neural Network Named-Entity Recognition

1 code implementation COLING 2018 Abbas Ghaddar, Philippe Langlais

While some features do remain in state-of-the-art systems, lexical features have been mostly discarded, with the exception of gazetteers.

Ranked #22 on Named Entity Recognition (NER) on Ontonotes v5 (English) (using extra training data)

named-entity-recognition Named Entity Recognition +1

A Deep Neural Network Approach To Parallel Sentence Extraction

no code implementations 28 Sep 2017 Francis Grégoire, Philippe Langlais

Parallel sentence extraction is a task addressing the data sparsity problem found in multilingual natural language processing applications.

Feature Engineering Machine Translation +3

Translating Implicit Discourse Connectives Based on Cross-lingual Annotation and Alignment

no code implementations WS 2017 Hongzheng Li, Philippe Langlais, Yaohong Jin

Implicit discourse connectives and relations are distributed more widely in Chinese texts; when translating into English, such connectives are usually translated explicitly.

Implicit Relations Machine Translation +1
