Multi-task Language Understanding

32 papers with code • 4 benchmarks • 5 datasets

The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more. https://arxiv.org/pdf/2009.03300.pdf
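A minimal sketch of how the benchmark is typically queried, assuming the Hugging Face `datasets` mirror `cais/mmlu` (the mirror name, subject, and prompt header are assumptions, not part of this page); the 5-shot prompt template follows the format commonly used for this benchmark, and the answer-scoring step is left abstract.

```python
# Sketch: build 5-shot MMLU-style prompts for one subject.
# Assumes the "cais/mmlu" mirror on the Hugging Face Hub (an assumption);
# scoring (comparing likelihoods of A/B/C/D) is model-specific and omitted.
from datasets import load_dataset

CHOICES = ["A", "B", "C", "D"]

def format_example(row, include_answer=True):
    prompt = row["question"].strip() + "\n"
    for letter, choice in zip(CHOICES, row["choices"]):
        prompt += f"{letter}. {choice}\n"
    prompt += "Answer:"
    if include_answer:
        prompt += f" {CHOICES[row['answer']]}\n\n"
    return prompt

subject = "high_school_computer_science"
dev = load_dataset("cais/mmlu", subject, split="dev")    # 5 few-shot examples
test = load_dataset("cais/mmlu", subject, split="test")

header = f"The following are multiple choice questions (with answers) about {subject.replace('_', ' ')}.\n\n"
few_shot = "".join(format_example(row) for row in dev)

prompts = [header + few_shot + format_example(row, include_answer=False) for row in test]
# Each prompt is scored by comparing the model's likelihood of " A"/" B"/" C"/" D"
# and taking the argmax against row["answer"].
print(prompts[0])
```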

Libraries

Use these libraries to find Multi-task Language Understanding models and implementations


Most implemented papers

Scaling Instruction-Finetuned Language Models

google-research/flan 20 Oct 2022

We find that instruction finetuning along these axes (scaling the number of tasks, scaling model size, and finetuning on chain-of-thought data) dramatically improves performance across model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation).
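To illustrate what instruction finetuning buys at inference time, here is a hedged sketch of zero-shot prompting one of the released Flan-T5 checkpoints with `transformers`; the checkpoint name and prompt are illustrative, not taken from the paper.

```python
# Sketch: zero-shot instruction following with a released Flan-T5 checkpoint.
# Checkpoint name and prompt are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

prompt = ("Answer the following multiple choice question.\n"
          "Which data structure gives O(1) average-time lookups? "
          "(A) linked list (B) hash table (C) binary heap (D) stack")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```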

PaLM: Scaling Language Modeling with Pathways

lucidrains/CoCa-pytorch Google Research 2022

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

eleutherai/gpt-neox BigScience (ACL) 2022

We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license.

Mistral 7B

mistralai/mistral-src 10 Oct 2023

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.

Mixtral of Experts

hit-scir/chinese-mixtral-8x7b 8 Jan 2024

In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.
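The layer behind those gains is a sparse mixture-of-experts feed-forward block that routes each token to two of eight experts. Below is a simplified top-2 gating sketch in PyTorch; the dimensions and expert MLP definition are assumptions, not the mistralai reference implementation.

```python
# Simplified top-2 mixture-of-experts routing in the spirit of Mixtral's
# sparse feed-forward layers. Sizes and expert definition are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, dim=512, hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, dim)
        logits = self.gate(x)                   # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e           # tokens sending slot k to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

x = torch.randn(4, 512)
print(Top2MoE()(x).shape)   # torch.Size([4, 512])
```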

UnifiedQA: Crossing Format Boundaries With a Single QA System

allenai/unifiedqa Findings of the Association for Computational Linguistics 2020

As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats.
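The unifying trick is a single plain-text input encoding shared across formats (question first, then options or context, joined by a literal "\n" marker). A hedged sketch of querying a released UnifiedQA checkpoint via `transformers`; the checkpoint name follows the allenai/unifiedqa release and the example question is illustrative.

```python
# Sketch: querying a released UnifiedQA checkpoint with the shared
# plain-text encoding (question, then options, separated by a literal "\n").
# Checkpoint name and example are illustrative assumptions.
from transformers import T5ForConditionalGeneration, T5Tokenizer

name = "allenai/unifiedqa-t5-small"
tokenizer = T5Tokenizer.from_pretrained(name)
model = T5ForConditionalGeneration.from_pretrained(name)

question = "which part of the plant makes food? \\n (a) roots (b) leaves (c) stem"
input_ids = tokenizer(question, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=16)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```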

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

allenai/dolma NA 2021

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Training Compute-Optimal Large Language Models

karpathy/llama2.c 29 Mar 2022

We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget.
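The headline result is that parameters and training tokens should scale roughly in equal proportion, landing near 20 tokens per parameter. A back-of-the-envelope sketch using the common C ≈ 6·N·D approximation for training FLOPs; the constant 6 and the 20:1 ratio are the usual rules of thumb, not exact fits from the paper's tables.

```python
# Back-of-the-envelope compute-optimal sizing in the Chinchilla spirit.
# Assumes C ≈ 6 * N * D (training FLOPs) and D ≈ 20 * N (tokens per parameter);
# both are rules of thumb, not exact values from the paper.
def compute_optimal(flop_budget, tokens_per_param=20.0):
    n_params = (flop_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

for budget in (1e21, 1e22, 1e23, 1e24):
    n, d = compute_optimal(budget)
    print(f"{budget:.0e} FLOPs -> ~{n / 1e9:.1f}B params, ~{d / 1e9:.0f}B tokens")
```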

UL2: Unifying Language Learning Paradigms

google-research/google-research 10 May 2022

Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.

Solving Quantitative Reasoning Problems with Language Models

gair-nlp/abel 29 Jun 2022

Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding.