Multi-task Language Understanding

32 papers with code • 4 benchmarks • 5 datasets

The benchmark covers 57 tasks, including elementary mathematics, US history, computer science, and law. Source: Measuring Massive Multitask Language Understanding, https://arxiv.org/pdf/2009.03300.pdf
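Because the benchmark spans many tasks, results are often easier to read per task than as one pooled number. The following is a minimal sketch of that bookkeeping, not the benchmark authors' evaluation code. It assumes multiple-choice answers labelled A to D, and the helper names (`per_task_accuracy`, `macro_average`), the task identifiers, and the toy records are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def per_task_accuracy(records):
    """Return {task: accuracy} for (task, predicted, gold) triples.

    The record layout is an assumption made for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, gold in records:
        total[task] += 1
        if predicted == gold:
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


def macro_average(task_scores):
    """Unweighted mean of per-task accuracies (each task counts equally)."""
    return sum(task_scores.values()) / len(task_scores)


# Toy, made-up predictions; not real benchmark data or results.
records = [
    ("elementary_mathematics", "A", "A"),
    ("elementary_mathematics", "B", "C"),
    ("us_history", "D", "D"),
    ("computer_science", "B", "B"),
    ("law", "A", "C"),
]

scores = per_task_accuracy(records)
for task, acc in scores.items():
    print(f"{task}: {acc:.2f}")
print(f"macro average: {macro_average(scores):.3f}")
```

A macro average weights every task equally regardless of how many questions it has; pooling all questions into a single accuracy is the other common choice and can give a different number when task sizes differ.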

Most implemented papers

RoBERTa: A Robustly Optimized BERT Pretraining Approach

pytorch/fairseq 26 Jul 2019

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Language Models are Few-Shot Learners

openai/gpt-3 NeurIPS 2020

By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

google-research/ALBERT ICLR 2020

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.

LLaMA: Open and Efficient Foundation Language Models

facebookresearch/llama arXiv 2023

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.

Language Models are Unsupervised Multitask Learners

openai/gpt-2 Preprint 2019

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.

Llama 2: Open Foundation and Fine-Tuned Chat Models

facebookresearch/llama 18 Jul 2023

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Evaluating Large Language Models Trained on Code

openai/human-eval 7 Jul 2021

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.

Measuring Massive Multitask Language Understanding

hendrycks/test 7 Sep 2020

By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings.

GLM-130B: An Open Bilingual Pre-trained Model

thudm/glm-130b 5 Oct 2022

We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.