1 code implementation • 8 Feb 2024 • Lucio Dery, Steven Kolawole, Jean-François Kagy, Virginia Smith, Graham Neubig, Ameet Talwalkar
Given the generational gap in hardware available to lay practitioners versus the best-resourced institutions, LLMs are becoming increasingly inaccessible as they grow in size.
no code implementations • 24 Jan 2024 • Ke Ye, Heinrich Jiang, Afshin Rostamizadeh, Ayan Chakrabarti, Giulia Desalvo, Jean-François Kagy, Lazaros Karydas, Gui Citovsky, Sanjiv Kumar
In this paper, we present SpacTor, a new training procedure consisting of (1) a hybrid objective combining span corruption (SC) and replaced token detection (RTD), and (2) a two-stage curriculum that optimizes the hybrid objective over the initial $\tau$ iterations, then transitions to the standard SC loss.
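The two-stage curriculum can be sketched as a simple loss schedule: a weighted SC+RTD objective for the first $\tau$ iterations, SC alone afterwards. The function names and the additive weighting below are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of SpacTor's two-stage curriculum: for the first
# tau steps the loss combines span corruption (SC) and replaced token
# detection (RTD); after tau, only the standard SC loss remains.
# The additive weighting and all names are illustrative assumptions.

def hybrid_loss(sc_loss: float, rtd_loss: float, step: int,
                tau: int, rtd_weight: float = 1.0) -> float:
    """Return the training loss at a given optimization step."""
    if step < tau:                      # stage 1: hybrid SC + RTD objective
        return sc_loss + rtd_weight * rtd_loss
    return sc_loss                      # stage 2: standard SC loss

# Example schedule with tau = 3: RTD contributes only to the first 3 steps.
losses = [hybrid_loss(sc_loss=2.0, rtd_loss=0.5, step=s, tau=3)
          for s in range(5)]
print(losses)  # [2.5, 2.5, 2.5, 2.0, 2.0]
```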
no code implementations • 12 Oct 2023 • Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, Rishabh Agarwal
Finally, in practical scenarios with models of varying sizes, first using distillation to boost the performance of the target model and then applying DistillSpec to train a well-aligned draft model can reduce decoding latency by 6-10x with minimal performance drop, compared to standard decoding without distillation.
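DistillSpec builds on speculative decoding, where a small draft model proposes tokens that the large target model then verifies. A toy sketch of the standard accept/reject loop is below; the stand-in probability callables and parameter names are assumptions for illustration, not the paper's implementation.

```python
import random

# Toy sketch of speculative decoding with a (distilled) draft model, in the
# spirit of DistillSpec: the draft model proposes gamma tokens, and the
# target model accepts each one with probability min(1, p_target / p_draft).
# `draft_prob` / `target_prob` are stand-in callables mapping a context to a
# {token: probability} dict; real systems use full LM distributions.

def speculative_step(prefix, draft_prob, target_prob, vocab,
                     gamma=4, rng=random):
    """Draft gamma tokens, then verify them against the target model."""
    # Phase 1: the cheap draft model samples gamma candidate tokens.
    proposals, ctx = [], list(prefix)
    for _ in range(gamma):
        probs = draft_prob(ctx)
        tok = rng.choices(vocab, weights=[probs[t] for t in vocab])[0]
        proposals.append(tok)
        ctx.append(tok)
    # Phase 2: the target model verifies each proposal in order.
    accepted, ctx = [], list(prefix)
    for tok in proposals:
        p_t = target_prob(ctx)[tok]
        p_d = draft_prob(ctx)[tok]
        if rng.random() < min(1.0, p_t / p_d):
            accepted.append(tok)
            ctx.append(tok)
        else:
            break  # rejected: real decoders resample from an adjusted target distribution
    return accepted
```

The better aligned the draft is with the target (which is what distilling the draft from the target improves), the higher the acceptance rate, and thus the larger the speedup per target-model call.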
no code implementations • 28 Jun 2019 • Jean-François Kagy, Tolga Kayadelen, Ji Ma, Afshin Rostamizadeh, Jana Strnadova
We tested, in a live setting, the use of active learning to select text sentences for human annotation when training a Thai segmentation machine learning model.
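A common flavour of such selection is uncertainty sampling: rank unlabeled sentences by how unsure the current model is and send the top-ranked ones to annotators. The sketch below uses mean per-token entropy as the acquisition score; this scoring choice and all names are assumptions for illustration, not necessarily the strategy used in the paper.

```python
import math

# Illustrative uncertainty-sampling loop for active learning: score each
# unlabeled sentence by the model's mean per-token entropy and pick the
# k most uncertain sentences for human annotation. `predict` is a stand-in
# callable returning one {label: probability} dict per token.

def token_entropy(dist):
    """Shannon entropy of a single token's label distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def select_for_annotation(sentences, predict, k=2):
    """Return the k sentences the model is least certain about."""
    def score(sent):
        dists = predict(sent)          # one distribution per token
        return sum(token_entropy(d) for d in dists) / len(dists)
    return sorted(sentences, key=score, reverse=True)[:k]
```

In a segmentation setting, the per-token distributions would come from the model's boundary/no-boundary predictions, so sentences with many ambiguous boundary decisions rank highest.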