Multi-task Language Understanding

32 papers with code • 4 benchmarks • 5 datasets

The MMLU benchmark covers 57 tasks, including elementary mathematics, US history, computer science, law, and more. https://arxiv.org/pdf/2009.03300.pdf
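For orientation, MMLU questions are four-way multiple choice, and causal LMs are commonly scored by comparing the log-probability they assign to each answer letter. Below is a minimal sketch of that scoring scheme, assuming a Hugging Face causal LM; the model name, prompt format, and `score_choice` helper are illustrative, not the official evaluation harness.

```python
# Minimal sketch: score a 4-way multiple-choice question MMLU-style by
# comparing the LM's log-probability of each answer letter.
# Model name and prompt format are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def score_choice(prompt: str, letter: str) -> float:
    """Log-probability the model assigns to `letter` right after `prompt`."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]           # next-token logits
    letter_id = tokenizer(" " + letter)["input_ids"][0]  # e.g. " A" is one token
    return torch.log_softmax(logits, dim=-1)[letter_id].item()

question = "What is 7 * 8?\nA. 54\nB. 56\nC. 58\nD. 64\nAnswer:"
pred = max("ABCD", key=lambda letter: score_choice(question, letter))
print(pred)  # a strong model should pick "B"
```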

Libraries

Use these libraries to find Multi-task Language Understanding models and implementations.

Most implemented papers

Atlas: Few-shot Learning with Retrieval Augmented Language Models

facebookresearch/atlas 5 Aug 2022

Retrieval-augmented models are known to excel at knowledge-intensive tasks without needing as many parameters, but it is unclear whether they work in few-shot settings.

Galactica: A Large Language Model for Science

paperswithcode/galai 16 Nov 2022

We believe these results demonstrate the potential for language models as a new interface for science.

REPLUG: Retrieval-Augmented Black-Box Language Models

intellabs/fastrag 30 Jan 2023

We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tunable retrieval model.
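As described in the paper, REPLUG prepends each retrieved document to the input separately and ensembles the LM's output distributions, weighted by retrieval score. Below is a minimal sketch of that ensembling step; `retrieve` and `lm_next_token_probs` are hypothetical stand-ins for a real retriever and a black-box LM API.

```python
# Minimal sketch of REPLUG-style ensembling, assuming the LM is a black box
# that only exposes next-token probabilities. `retrieve` and
# `lm_next_token_probs` are hypothetical stand-ins.
import numpy as np

def replug_next_token_probs(query, retrieve, lm_next_token_probs, k=4):
    docs, scores = retrieve(query, k)                # top-k docs + similarity scores
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over retrieval scores
    # Run the LM once per document, prepending it to the query,
    # then mix the k output distributions by retrieval weight.
    return sum(w * lm_next_token_probs(doc + "\n" + query)
               for w, doc in zip(weights, docs))
```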

PaLM 2 Technical Report

eternityyw/tram-benchmark 17 May 2023

Through extensive evaluations on English and multilingual language understanding and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Textbooks Are All You Need II: phi-1.5 technical report

knowlab/bi-weekly-paper-presentation 11 Sep 2023

We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories -- a 10 million parameter model that can produce coherent English -- and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state of the art.

Are Human-generated Demonstrations Necessary for In-context Learning?

ruili33/sec 26 Sep 2023

In this paper, we raise the fundamental question of whether human-generated demonstrations are necessary for ICL.
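The alternative the question points at is letting the model write its own demonstrations before answering. Below is a hedged sketch of that two-step prompt flow; `generate` is a stand-in for any text-completion API, and the prompt wording is an assumption, not the paper's exact templates.

```python
# Sketch of in-context learning with self-generated demonstrations instead
# of human-written ones. `generate` stands in for any text-completion API;
# the prompt wording is an illustrative assumption.
def answer_with_self_demos(generate, task_description: str,
                           query: str, n_demos: int = 3) -> str:
    # Step 1: ask the model to write its own input/output examples.
    demo_prompt = (f"{task_description}\n"
                   f"Write {n_demos} example question/answer pairs for this task:\n")
    demos = generate(demo_prompt)
    # Step 2: answer the real query with those demos as in-context examples.
    final_prompt = f"{task_description}\n{demos}\nQuestion: {query}\nAnswer:"
    return generate(final_prompt)
```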

Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU

fajri91/indommlu 7 Oct 2023

In this work, we introduce IndoMMLU, the first multi-task language understanding benchmark for Indonesian culture and languages, which consists of questions from primary school to university entrance exams in Indonesia.

MiLe Loss: a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models

suu990901/LLaMA-InfoEntropy-Loss 30 Oct 2023

Experiments reveal that models incorporating the proposed MiLe Loss achieve consistent performance improvements on downstream benchmarks.
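The core idea is to reweight the training signal by how informative each token is, rather than treating all tokens equally. Below is a rough, heavily hedged sketch of an entropy-weighted cross-entropy in that spirit; the exact MiLe formulation may differ, and `gamma` is an assumed knob, not a parameter from the paper.

```python
# Hedged sketch in the spirit of MiLe Loss: per-token cross-entropy is
# reweighted by the entropy of the model's predicted distribution, so
# low-entropy "easy" tokens contribute less. The paper's exact weighting
# may differ; gamma is an assumed hyperparameter.
import torch
import torch.nn.functional as F

def entropy_weighted_ce(logits: torch.Tensor, targets: torch.Tensor,
                        gamma: float = 1.0) -> torch.Tensor:
    # logits: (batch, seq, vocab); targets: (batch, seq) of token ids
    log_probs = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_probs.flatten(0, 1), targets.flatten(),
                    reduction="none")                      # per-token CE
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1).flatten()   # per-token entropy
    weights = entropy.detach() ** gamma                    # stop-grad on weights
    return (weights * ce).mean()
```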

Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks

wuhy68/parameter-efficient-moe 5 Jan 2024

Instruction tuning, a successful paradigm, enhances the ability of LLMs to follow natural language instructions and exhibit robust generalization across a wide range of tasks.
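The title's "sparsity crafting" refers to turning a dense model into a mixture-of-experts. Below is a hedged sketch of the general dense-to-MoE conversion with Switch-style top-1 routing; the expert count, routing rule, and class name are assumptions, and the paper's parameter-efficient details (e.g. adapters) are omitted.

```python
# Hedged sketch of "crafting" a mixture-of-experts layer from a dense MLP:
# each expert starts as a copy of the dense layer and a learned router picks
# one expert per token. Top-1 routing and n_experts are assumptions; the
# paper's parameter-efficient training details are omitted.
import copy
import torch
import torch.nn as nn

class CraftedMoE(nn.Module):
    def __init__(self, dense_mlp: nn.Module, d_model: int, n_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(copy.deepcopy(dense_mlp)
                                     for _ in range(n_experts))
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        probs = self.router(x).softmax(dim=-1)            # (tokens, n_experts)
        top_p, expert_idx = probs.max(dim=-1)             # top-1 gate + index
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale by the gate value so the router receives gradients.
                out[mask] = top_p[mask].unsqueeze(-1) * expert(x[mask])
        return out
```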

Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration

leeroo-ai/leeroo_orchestrator 25 Jan 2024

In this paper, we propose an architecture to harness the collective knowledge of multiple trained LLMs to create a new state-of-the-art.
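At a high level, such an orchestrator learns to route each query to whichever underlying LLM handles it best. Below is a minimal sketch of that routing loop; the expert registry and the keyword router are illustrative stand-ins for the learned orchestrator described in the paper.

```python
# Minimal sketch of an orchestrator that routes each query to one of several
# expert LLMs. The expert registry and keyword router are illustrative
# stand-ins for a learned routing model.
from typing import Callable, Dict

def orchestrate(query: str,
                experts: Dict[str, Callable[[str], str]],
                pick_expert: Callable[[str], str]) -> str:
    """Route `query` to the expert chosen by `pick_expert`, return its answer."""
    name = pick_expert(query)  # e.g. a small learned router model
    return experts[name](query)

# Usage: a trivial keyword router standing in for the learned orchestrator.
experts = {"math": lambda q: "math-LLM answer",
           "general": lambda q: "general-LLM answer"}
router = lambda q: "math" if any(ch.isdigit() for ch in q) else "general"
print(orchestrate("What is 12 * 12?", experts, router))  # -> math-LLM answer
```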