Sequential Instructions

Introduced by Kirk et al. in Understanding the Effects of RLHF on LLM Generalisation and Diversity

This is the sequential instructions dataset from Understanding the Effects of RLHF on LLM Generalisation and Diversity. The dataset is in the alpaca_eval format.
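As a rough illustration of what working with alpaca_eval-style data can look like, here is a minimal sketch that parses a JSON array of records and checks for the fields it needs. The field names (`instruction`, `output`, `generator`, `dataset`), the sample record, and the helper `load_records` are assumptions made for illustration; none of them are taken from this page, so check them against the actual files before relying on them.

```python
import json

# Hypothetical record in the alpaca_eval style. The field names and the
# content are illustrative assumptions, not data from the real dataset.
SAMPLE = """
[
  {"instruction": "List three fruits, then sort them alphabetically.",
   "output": "apple, banana, cherry",
   "generator": "example-model",
   "dataset": "sequential_instructions"}
]
"""

# Fields this sketch assumes every record carries.
REQUIRED_FIELDS = ("instruction", "output")


def load_records(text):
    """Parse a JSON array of records and verify the assumed fields exist."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    for i, rec in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if f not in rec]
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
    return records


records = load_records(SAMPLE)
print(len(records), records[0]["instruction"])
```

To read a real file instead, pass its contents to `load_records`; the validation step is there only to fail early if the assumed field names turn out not to match.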

For information about how the dataset was generated, see https://github.com/RobertKirk/stanford_alpaca.

Each instruction in the dataset generally contains a sequence of steps that we expect the model to complete in a single response. In our work, we found that, when both are trained on the AlpacaFarm datasets, RLHF models generalise much better to this dataset than SFT models.

Tasks

Instruction Following

Similar Datasets

AlpacaEval

TLDR9+

License

Unknown

Modalities

Texts

Languages

English
