RoFT is a dataset of 21,000 human annotations of generated text. The task is boundary detection: given a passage that begins as human-written text, determine the point at which it transitions to machine-generated text. The dataset also includes error annotations that follow the taxonomy introduced in the paper. The data can be used to train automatic detection systems, train automatic error-correction systems, analyze how visible model errors are, and compare performance across models. Data was collected using http://roft.io.

Models: GPT2, GPT2-XL, CTRL, GPT3 "Davinci"

Genres: News, Stories, Recipes, Speeches
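The boundary detection task described above can be illustrated with a minimal sketch. It assumes a passage is stored as an ordered list of sentences together with the index of the first machine-generated sentence. The class name, field names, and the signed-distance measure are hypothetical choices made for illustration; they are not the dataset's actual schema or the scoring used by the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundaryExample:
    """One passage for boundary detection (hypothetical layout, not the RoFT schema)."""
    sentences: List[str]  # sentences in reading order
    boundary: int         # assumed: index of the first machine-generated sentence


def boundary_error(example: BoundaryExample, guess: int) -> int:
    """Return the signed distance from the guessed boundary to the true one.

    0 means the guess is exact, a positive value means the guess falls after
    the true boundary, and a negative value means a human-written sentence
    was flagged as machine-generated.
    """
    if not 0 <= guess < len(example.sentences):
        raise ValueError("guess must index a sentence in the passage")
    return guess - example.boundary


# Made-up passage: the first two sentences stand in for human-written text.
example = BoundaryExample(
    sentences=[
        "Preheat the oven.",
        "Mix the flour and sugar.",
        "Add the eggs slowly.",
        "Bake until golden.",
    ],
    boundary=2,
)
late_guess_error = boundary_error(example, 3)
```

A signed distance keeps late guesses and early guesses distinguishable, which matters if the two kinds of mistakes are to be analyzed separately.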

License


  • MIT