Multi-task Language Understanding

30 papers with code • 4 benchmarks • 5 datasets

The MMLU test covers 57 tasks including elementary mathematics, US history, computer science, law, and more. https://arxiv.org/pdf/2009.03300.pdf
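As a rough sketch of how the test data is commonly accessed, the 57-subject set is mirrored on the Hugging Face Hub; the dataset id cais/mmlu, its "all" configuration, and the field names below are assumptions about that mirror, not something stated on this page.

```python
# Minimal sketch: loading the 57-subject MMLU test split.
# Assumes the Hugging Face `datasets` library and the community mirror
# `cais/mmlu` with its "all" configuration (assumptions, not from this page).
from datasets import load_dataset

mmlu = load_dataset("cais/mmlu", "all", split="test")

example = mmlu[0]
print(example["subject"])   # e.g. "abstract_algebra"
print(example["question"])  # question text
print(example["choices"])   # four answer options
print(example["answer"])    # index (0-3) of the correct option
```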

Libraries

Use these libraries to find Multi-task Language Understanding models and implementations
See all 10 libraries.

Most implemented papers

RoBERTa: A Robustly Optimized BERT Pretraining Approach

pytorch/fairseq 26 Jul 2019

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

google-research/ALBERT ICLR 2020

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.

Language Models are Few-Shot Learners

openai/gpt-3 NeurIPS 2020

Humans can generally perform a new language task from only a few examples or from simple instructions, something which current NLP systems still largely struggle to do.
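The "few examples or simple instructions" setting is typically realized through in-context learning: labelled demonstrations are concatenated into the prompt and the model is asked to continue the pattern, with no weight updates. A minimal, model-agnostic sketch (the sentiment task, examples, and placeholder model call are purely illustrative):

```python
# Minimal sketch of few-shot (in-context) prompting: a handful of labelled
# demonstrations are packed into the prompt, followed by the query; the model
# is expected to continue the pattern. Task and examples are illustrative.

demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
    ("A quiet, moving story with superb acting.", "positive"),
]

def build_few_shot_prompt(query: str) -> str:
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in demonstrations:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = build_few_shot_prompt("The plot made no sense at all.")
# completion = complete(prompt)  # placeholder: call whichever language model you use
print(prompt)
```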

LLaMA: Open and Efficient Foundation Language Models

facebookresearch/llama arXiv 2023

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.

Language Models are Unsupervised Multitask Learners

openai/gpt-2 Preprint 2019

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.

Evaluating Large Language Models Trained on Code

openai/human-eval 7 Jul 2021

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.
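The accompanying openai/human-eval repository scores generated completions for functional correctness (pass@k). The outline below follows the usage documented in that repository's README, with generate_one_completion standing in as a placeholder for whatever model call you use.

```python
# Outline of the HumanEval workflow from the openai/human-eval README:
# read the benchmark problems, sample a completion per task from your model,
# write them to JSONL, then score with the repo's command-line tool.
from human_eval.data import read_problems, write_jsonl

def generate_one_completion(prompt: str) -> str:
    # Trivial stub; replace with a real call to your code model.
    return "    pass\n"

problems = read_problems()
samples = [
    {"task_id": task_id,
     "completion": generate_one_completion(problems[task_id]["prompt"])}
    for task_id in problems
]
write_jsonl("samples.jsonl", samples)

# Then, from the shell:
#   evaluate_functional_correctness samples.jsonl
```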

Llama 2: Open Foundation and Fine-Tuned Chat Models

facebookresearch/llama 18 Jul 2023

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Measuring Massive Multitask Language Understanding

hendrycks/test 7 Sep 2020

By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings.
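The test itself is four-way multiple choice, and the headline number is an accuracy averaged over the 57 subjects. The sketch below shows the general prompt layout (choices labelled A-D followed by "Answer:") and a simple per-subject average; the example question, the per-subject numbers, and the equal-weighting convention are illustrative assumptions rather than the repository's exact evaluation script.

```python
# Sketch of the four-way multiple-choice format used by the MMLU test and of
# a headline score computed as the mean of per-subject accuracies.
# Example question and numbers are illustrative; exact prompt wording and
# weighting conventions vary between evaluation harnesses.

CHOICE_LABELS = ["A", "B", "C", "D"]

def format_question(question: str, choices: list[str]) -> str:
    lines = [question]
    for label, choice in zip(CHOICE_LABELS, choices):
        lines.append(f"{label}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)

print(format_question(
    "Which gas makes up most of Earth's atmosphere?",
    ["Oxygen", "Nitrogen", "Carbon dioxide", "Argon"],
))

def average_accuracy(per_subject: dict[str, float]) -> float:
    """Mean of per-subject accuracies, each subject weighted equally."""
    return sum(per_subject.values()) / len(per_subject)

print(average_accuracy({"us_history": 0.62, "college_physics": 0.34, "professional_law": 0.45}))
```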

GLM-130B: An Open Bilingual Pre-trained Model

thudm/glm-130b 5 Oct 2022

We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.