Search Results for author: Lysandre Debut

Found 4 papers, 4 papers with code

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

32 code implementations · NeurIPS 2019 · Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on the edge and/or under constrained computational training or inference budgets remains challenging.

Ranked #2 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (Wasserstein Distance (WD) metric, using extra training data)

Tasks: Hate Speech Detection, Knowledge Distillation, +8 more
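The distilled checkpoint from the paper above is distributed through Hugging Face's transformers library. As a minimal, illustrative sketch (assuming the publicly available distilbert-base-uncased checkpoint and a PyTorch install, neither of which is named in this listing), loading DistilBERT for feature extraction looks roughly like this:

```python
# Illustrative sketch only; assumes the public "distilbert-base-uncased"
# checkpoint on the Hugging Face Hub and the transformers + torch packages.
import torch
from transformers import AutoTokenizer, AutoModel

# Load the distilled tokenizer and encoder.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

# Encode a sentence and take the first-token hidden state as a sentence vector.
inputs = tokenizer(
    "DistilBERT is smaller, faster, cheaper and lighter.",
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

sentence_vector = outputs.last_hidden_state[:, 0]  # shape: (1, 768)
print(sentence_vector.shape)
```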
