Search Results for author: Victor Sanh

Found 12 papers, 10 papers with code

Block Pruning For Faster Transformers

1 code implementation • 10 Sep 2021 • François Lagunas, Ella Charlaix, Victor Sanh, Alexander M. Rush

Pre-training has improved model accuracy for both classification and generation tasks at the cost of introducing much larger and slower models.

Classification • Machine Translation +1

Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning

1 code implementation • 9 Sep 2021 • Prasetya Ajie Utama, Nafise Sadat Moosavi, Victor Sanh, Iryna Gurevych

Recent prompt-based approaches allow pretrained language models to achieve strong performances on few-shot finetuning by reformulating downstream tasks as a language modeling problem.

Language Modelling
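A minimal sketch of the reformulation this abstract describes: a classification example is cast as a cloze-style language-modeling problem and scored through a verbalizer. The model name, prompt template, and label words below are illustrative assumptions, not the paper's setup.

```python
# Sketch only: classification recast as masked language modeling,
# the general idea behind prompt-based finetuning.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# A downstream sentiment example rewritten as a cloze prompt (illustrative).
text = "The movie was a complete waste of time. It was [MASK]."
verbalizer = {"positive": "great", "negative": "terrible"}

inputs = tokenizer(text, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Score each label by the logit of its verbalizer token at the mask position.
scores = {label: logits[tokenizer.convert_tokens_to_ids(word)].item()
          for label, word in verbalizer.items()}
print(max(scores, key=scores.get))
```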

Low-Complexity Probing via Finding Subnetworks

1 code implementation • NAACL 2021 • Steven Cao, Victor Sanh, Alexander M. Rush

The dominant approach in probing neural networks for linguistic properties is to train a new shallow multi-layer perceptron (MLP) on top of the model's internal representations.
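A minimal sketch of the standard probing setup described above: a shallow MLP trained on frozen representations. The dimensions and stand-in inputs are illustrative assumptions; the paper itself probes by finding subnetworks rather than training a new MLP.

```python
# Sketch only: an MLP probe trained on top of frozen model representations.
import torch
import torch.nn as nn

class MLPProbe(nn.Module):
    def __init__(self, hidden_size=768, num_labels=17):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, 256), nn.ReLU(), nn.Linear(256, num_labels)
        )

    def forward(self, reps):  # reps: [batch, hidden_size], from a frozen encoder
        return self.net(reps)

probe = MLPProbe()
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random tensors stand in for frozen encoder representations here.
reps, labels = torch.randn(32, 768), torch.randint(0, 17, (32,))
loss = loss_fn(probe(reps), labels)
loss.backward()
optimizer.step()
```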

Learning from others' mistakes: Avoiding dataset biases without modeling them

no code implementations • ICLR 2021 • Victor Sanh, Thomas Wolf, Yonatan Belinkov, Alexander M. Rush

State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended underlying task.

Movement Pruning: Adaptive Sparsity by Fine-Tuning

3 code implementations • NeurIPS 2020 • Victor Sanh, Thomas Wolf, Alexander M. Rush

Magnitude pruning is a widely used strategy for reducing model size in pure supervised learning; however, it is less effective in the transfer learning regime that has become standard for state-of-the-art natural language processing applications.

Network Pruning • Transfer Learning
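A minimal sketch of the magnitude-pruning baseline the abstract contrasts with: keep only the largest-magnitude weights in a layer and zero the rest. The paper's movement pruning instead scores weights by how they change during fine-tuning; the layer size and sparsity level below are illustrative assumptions.

```python
# Sketch only: magnitude pruning of a single linear layer.
import torch
import torch.nn as nn

def magnitude_prune_(linear: nn.Linear, sparsity: float = 0.9) -> None:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    w = linear.weight.data
    k = int(w.numel() * sparsity)
    threshold = w.abs().flatten().kthvalue(k).values
    mask = (w.abs() > threshold).to(w.dtype)
    w.mul_(mask)

layer = nn.Linear(768, 768)
magnitude_prune_(layer, sparsity=0.9)
print(f"remaining nonzero weights: {(layer.weight != 0).float().mean():.2%}")
```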

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

18 code implementations • NeurIPS 2019 • Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains challenging.

Hate Speech Detection • Knowledge Distillation +7
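A minimal sketch, assuming the Hugging Face transformers library is available, that compares the distilled model's parameter count against BERT-base. It only illustrates the "smaller" claim, not the distillation procedure itself.

```python
# Sketch only: compare parameter counts of the teacher and the distilled student.
from transformers import AutoModel

teacher = AutoModel.from_pretrained("bert-base-uncased")
student = AutoModel.from_pretrained("distilbert-base-uncased")

def count_params(model):
    return sum(p.numel() for p in model.parameters())

print(f"BERT-base parameters:  {count_params(teacher):,}")
print(f"DistilBERT parameters: {count_params(student):,}")
```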

TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents

17 code implementations • 23 Jan 2019 • Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue

We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo, which is a combination of a Transfer Learning based training scheme and a high-capacity Transformer model.

Dialogue Generation • Information Retrieval +1
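A minimal sketch of the input construction suggested by this abstract: persona sentences, dialogue history, and a candidate reply are concatenated and fed to a causal language model for a language-modeling loss. The model choice and flattening scheme are illustrative assumptions, not the released TransferTransfo training code (which additionally uses a next-utterance classification head).

```python
# Sketch only: persona + history + reply flattened into one sequence for a causal LM.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

persona = ["i love hiking.", "i have two dogs."]
history = ["hi, how are you?", "great, just back from a walk with my dogs."]
reply = "nice! do you take them hiking with you?"

# Flatten persona, alternating dialogue turns, and the reply into a single string.
text = " ".join(persona) + " " + " ".join(history) + " " + reply
inputs = tokenizer(text, return_tensors="pt")

# Standard language-modeling loss over the whole sequence.
outputs = model(**inputs, labels=inputs.input_ids)
print(float(outputs.loss))
```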

A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

1 code implementation • 14 Nov 2018 • Victor Sanh, Thomas Wolf, Sebastian Ruder

The model is trained in a hierarchical fashion to introduce an inductive bias by supervising a set of low level tasks at the bottom layers of the model and more complex tasks at the top layers of the model.

Ranked #9 on Relation Extraction on ACE 2005 (using extra training data)

Multi-Task Learning • Named Entity Recognition +1
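A minimal sketch of the hierarchical multi-task supervision described above: a low-level head is attached to the lower layers and a more complex head to the upper layers. The tasks, layer types, and dimensions are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch only: low-level task supervised early, higher-level task supervised late.
import torch
import torch.nn as nn

class HierarchicalMultiTask(nn.Module):
    def __init__(self, d_model=256, ner_labels=9, rel_labels=7):
        super().__init__()
        self.lower = nn.LSTM(d_model, d_model, batch_first=True)  # bottom layers
        self.upper = nn.LSTM(d_model, d_model, batch_first=True)  # top layers
        self.ner_head = nn.Linear(d_model, ner_labels)  # low-level task head
        self.rel_head = nn.Linear(d_model, rel_labels)  # higher-level task head

    def forward(self, embeddings):  # embeddings: [batch, seq, d_model]
        low, _ = self.lower(embeddings)
        high, _ = self.upper(low)
        return self.ner_head(low), self.rel_head(high.mean(dim=1))

model = HierarchicalMultiTask()
x = torch.randn(4, 20, 256)
ner_logits, rel_logits = model(x)
print(ner_logits.shape, rel_logits.shape)  # (4, 20, 9) and (4, 7)
```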
