TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Multi-task Language Understanding	MMLU	GPT-NeoX 20B (5-shot)	Average (%)	33.6	# 86
Multi-task Language Understanding	MMLU	GPT-J 6B (zero-shot)	Average (%)	27.3	# 95
Multi-task Language Understanding	MMLU	GPT-NeoX 20B (0-shot)	Average (%)	28.6	# 92

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/gpt-neox-20b-an-open-source-autoregressive-1/multi-task-language-understanding-on-mmlu)](https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu?p=gpt-neox-20b-an-open-source-autoregressive-1)`

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

BigScience (ACL) 2022 · Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach

We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license. It is, to the best of our knowledge, the largest dense autoregressive model that has publicly available weights at the time of submission. In this work, we describe GPT-NeoX-20B's architecture and training and evaluate its performance on a range of language-understanding, mathematics, and knowledge-based tasks. We find that GPT-NeoX-20B is a particularly powerful few-shot reasoner and gains far more in performance when evaluated five-shot than similarly sized GPT-3 and FairSeq models. We open-source the training and evaluation code, as well as the model weights, at https://github.com/EleutherAI/gpt-neox.
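
The zero-shot versus five-shot comparison mentioned in the abstract comes down to how many solved examples are placed in the prompt before the question being scored. The sketch below illustrates that idea for a multiple-choice task such as MMLU. It is not the authors' evaluation code: the prompt layout, the `Example` type, and the function names `format_example` and `build_prompt` are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical record type: (question, answer choices, index of the correct choice).
Example = Tuple[str, List[str], int]

LETTERS = "ABCD"


def format_example(question: str, choices: List[str], answer: Optional[int] = None) -> str:
    """Render one multiple-choice item; the answer letter is omitted when answer is None."""
    lines = [question]
    for letter, choice in zip(LETTERS, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:" if answer is None else f"Answer: {LETTERS[answer]}")
    return "\n".join(lines)


def build_prompt(shots: List[Example], question: str, choices: List[str], k: int) -> str:
    """Prepend the first k solved examples (k=0 gives a zero-shot prompt)."""
    blocks = [format_example(q, c, a) for q, c, a in shots[:k]]
    # The query itself is left unanswered so the model has to complete it.
    blocks.append(format_example(question, choices))
    return "\n\n".join(blocks)
```

Calling `build_prompt(shots, question, choices, k=0)` yields only the unanswered question, while `k=5` places five solved examples in front of it, which is the setting behind the 5-shot row in the results table.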

Code

eleutherai/gpt-neox official

6,556

labmlai/annotated_deep_learning_pap…

↳ View annotated code at labml.ai

47,519

labmlai/neox

110

2023-MindSpore-1/ms-code-13

2023-MindSpore-1/ms-code-153

Tasks

Language Modelling

Multi-task Language Understanding

Datasets

MMLU

HellaSwag

MATH

PIQA

The Pile

LAMBADA

LogiQA

PROST

Results from the Paper

Ranked #86 on Multi-task Language Understanding on MMLU

Methods

Adam • Attention Dropout • BPE • Cosine Annealing • Dense Connections • Dropout • Fixed Factorized Attention • GELU • GPT-3 • GPT-Neo • GPT-NeoX • Layer Normalization • Linear Layer • Linear Warmup With Cosine Annealing • Multi-Head Attention • Residual Connection • Scaled Dot-Product Attention • Softmax • Strided Attention • Weight Decay
