Multi-task Language Understanding
32 papers with code • 4 benchmarks • 5 datasets
The standard benchmark here is MMLU (Hendrycks et al., 2020), a test covering 57 tasks including elementary mathematics, US history, computer science, law, and more. https://arxiv.org/pdf/2009.03300.pdf
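MMLU-style benchmarks are four-way multiple-choice tests scored by accuracy, usually reported per subject and averaged. Below is a minimal, hypothetical sketch of that scoring loop; the question format and the `predict_choice` stub are illustrative assumptions, not the benchmark's official harness.

```python
# Hypothetical sketch of MMLU-style multiple-choice scoring.

QUESTIONS = [
    {
        "question": "What is 7 * 8?",
        "choices": {"A": "54", "B": "56", "C": "64", "D": "48"},
        "answer": "B",
    },
    # ... one dict per question, grouped into subject-level tasks
]

def predict_choice(question: str, choices: dict) -> str:
    """Stub: replace with a real model call that returns 'A'..'D'."""
    return "A"

def accuracy(questions) -> float:
    correct = sum(
        predict_choice(q["question"], q["choices"]) == q["answer"]
        for q in questions
    )
    return correct / len(questions)

print(f"accuracy: {accuracy(QUESTIONS):.3f}")
```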
Most implemented papers
Atlas: Few-shot Learning with Retrieval Augmented Language Models
Retrieval-augmented models are known to excel at knowledge-intensive tasks without the need for as many parameters, but it is unclear whether they work in few-shot settings.
Galactica: A Large Language Model for Science
We believe these results demonstrate the potential for language models as a new interface for science.
REPLUG: Retrieval-Augmented Black-Box Language Models
We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model.
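The core mechanism is simple: each retrieved document is prepended to the input, the frozen LM is queried once per document, and the resulting next-token distributions are ensembled with weights derived from the retrieval scores. A minimal sketch of that ensembling step follows; `lm_next_token_probs` is a hypothetical stand-in for any black-box LM API.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def lm_next_token_probs(prompt: str) -> dict:
    """Stub for a black-box LM returning {token: probability}."""
    return {"the": 0.5, "a": 0.5}

def replug_next_token_probs(query: str, docs_with_scores):
    """Weight each document's next-token distribution by its
    softmaxed retrieval score and sum the distributions."""
    weights = softmax([score for _, score in docs_with_scores])
    ensembled = {}
    for (doc, _), w in zip(docs_with_scores, weights):
        probs = lm_next_token_probs(doc + "\n\n" + query)
        for tok, p in probs.items():
            ensembled[tok] = ensembled.get(tok, 0.0) + w * p
    return ensembled

print(replug_next_token_probs(
    "Paris is the capital of",
    [("France is in Europe.", 1.2), ("Paris hosts the Louvre.", 0.8)],
))
```

Because only the retriever is tuned, this scheme works with LMs that expose nothing beyond output probabilities.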
PaLM 2 Technical Report
Through extensive evaluations on English and multilingual language tasks and on reasoning tasks, we demonstrate that PaLM 2 significantly improves quality on downstream tasks across different model sizes, while exhibiting faster and more efficient inference than PaLM.
Textbooks Are All You Need II: phi-1.5 technical report
We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories -- a 10 million parameter model that can produce coherent English -- and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art.
Are Human-generated Demonstrations Necessary for In-context Learning?
In this paper, we raise the fundamental question of whether human-generated demonstrations are necessary for ICL.
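The alternative the paper studies is letting the model write its own demonstrations before answering. A hedged sketch of that two-step prompting pattern is below; `call_llm` is a hypothetical stand-in for any text-generation API, not the paper's code.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a text-generation API call."""
    return "..."

def self_generated_icl(task_description: str, query: str, k: int = 3) -> str:
    # Step 1: ask the model to write its own k demonstrations.
    demos = call_llm(
        f"{task_description}\n"
        f"Write {k} example input/output pairs for this task."
    )
    # Step 2: answer the real query with the self-generated
    # demonstrations placed in context, as in ordinary ICL.
    return call_llm(f"{task_description}\n{demos}\nInput: {query}\nOutput:")
```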
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU
In this work, we introduce IndoMMLU, the first multi-task language understanding benchmark for Indonesian culture and languages, which consists of questions from primary school to university entrance exams in Indonesia.
MiLe Loss: a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
Experiments reveal that models incorporating the proposed MiLe Loss can gain consistent performance improvement on downstream benchmarks.
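The idea is to reweight each token's training loss by a learning-difficulty signal so that easy tokens do not dominate. The sketch below assumes one common formulation -- scaling per-token cross-entropy by the entropy of the predicted distribution raised to a power gamma -- and the paper's exact weighting may differ.

```python
import torch
import torch.nn.functional as F

def entropy_weighted_ce(logits, targets, gamma=1.0):
    """Per-token cross-entropy scaled by predicted-distribution entropy.
    Assumed formulation; not necessarily the paper's exact loss.
    logits: (batch, seq, vocab), targets: (batch, seq)."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    # Standard per-token cross-entropy.
    ce = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Entropy of the model's predicted distribution as a difficulty proxy.
    entropy = -(probs * log_probs).sum(dim=-1)
    weights = entropy.detach() ** gamma  # don't backprop through weights
    return (weights * ce).mean()

logits = torch.randn(2, 5, 100)
targets = torch.randint(0, 100, (2, 5))
print(entropy_weighted_ce(logits, targets, gamma=1.0))
```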
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
Instruction tuning, a successful paradigm, enhances the ability of LLMs to follow natural language instructions and exhibit robust generalization across a wide range of tasks.
Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration
In this paper, we propose an architecture to harness the collective knowledge of multiple trained LLMs to create a new state-of-the-art.
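The orchestrator pattern routes each query to whichever underlying expert model is expected to handle it best, then returns that expert's answer. A minimal, hypothetical sketch follows: the keyword router and stubbed experts here are illustrative only, standing in for Leeroo's learned router and real LLM backends.

```python
# Hypothetical sketch of an LLM orchestrator: a router picks one of
# several expert models per query. Experts are stubbed as functions.

EXPERTS = {
    "code": lambda q: f"[code-expert answer to: {q}]",
    "math": lambda q: f"[math-expert answer to: {q}]",
    "general": lambda q: f"[general-expert answer to: {q}]",
}

def route(query: str) -> str:
    """Toy keyword router; a learned router would score experts instead."""
    q = query.lower()
    if any(w in q for w in ("python", "function", "bug")):
        return "code"
    if any(w in q for w in ("integral", "solve", "equation")):
        return "math"
    return "general"

def orchestrate(query: str) -> str:
    return EXPERTS[route(query)](query)

print(orchestrate("Solve the equation x^2 = 9"))
```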