2 code implementations • 13 Jan 2022 • Stella Biderman, Kieran Bicheno, Leo Gao
This datasheet describes the Pile, an 825 GiB dataset of human-authored text compiled by EleutherAI for use in large-scale language modeling.
Language Modelling