Search Results for author: Sid Black

Found 3 papers, 2 papers with code

Interpreting Neural Networks through the Polytope Lens

no code implementations · 22 Nov 2022 · Sid Black, Lee Sharkey, Leo Grinsztajn, Eric Winsor, Dan Braun, Jacob Merizian, Kip Parker, Carlos Ramón Guevara, Beren Millidge, Gabriel Alfour, Connor Leahy

Previous mechanistic descriptions have used individual neurons or their linear combinations to understand the representations a network has learned.

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

10 code implementations · BigScience (ACL) 2022 · Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach

We introduce GPT-NeoX-20B, a 20-billion-parameter autoregressive language model trained on the Pile, whose weights are made freely and openly available to the public through a permissive license.

Ranked #95 on Multi-task Language Understanding on MMLU (using extra training data)

Tasks: Language Modeling, Language Modelling, +1

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

21 code implementations · 31 Dec 2020 · Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy

Recent work has demonstrated that increased training dataset diversity improves general cross-domain knowledge and downstream generalization capability for large-scale language models.

Tasks: Diversity, Language Modeling, +1
