Search Results for author: Sedrick Keh

Found 3 papers, 2 papers with code

Language models scale reliably with over-training and on downstream tasks

1 code implementation • 13 Mar 2024 • Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Alexandros G. Dimakis, Gabriel Ilharco, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt

We fit scaling laws that extrapolate in both the number of model parameters and the ratio of training tokens to parameters.

Language Modelling


A Critical Evaluation of AI Feedback for Aligning Large Language Models

1 code implementation • 19 Feb 2024 • Archit Sharma, Sedrick Keh, Eric Mitchell, Chelsea Finn, Kushal Arora, Thomas Kollar

RLAIF first performs supervised fine-tuning (SFT) using demonstrations from a teacher model and then further fine-tunes the model with reinforcement learning (RL), using feedback from a critic model.

Instruction Following • reinforcement-learning • +1


Asking More Informative Questions for Grounded Retrieval

no code implementations • 14 Nov 2023 • Sedrick Keh, Justin T. Chiu, Daniel Fried

When a model is trying to gather information in an interactive setting, it benefits from asking informative questions.

Question Answering • Question Selection • +2

