12 papers with code • 1 benchmark • 1 dataset
To demonstrate that large language models can further advance the state of the art (SOTA), we train an 8.3 billion parameter transformer language model similar to GPT-2 and a 3.9 billion parameter model similar to BERT.
Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.
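As an illustration (not taken from any of the listed papers), here is a minimal NumPy sketch of a causal temporal convolution, the kind of building block such architectures use in place of recurrence; the function name and toy shapes are ours.

```python
import numpy as np

def causal_conv1d(x, w):
    """Causal 1D convolution: the output at step t depends only on
    inputs up to and including step t.

    x: (seq_len, d_in) input sequence
    w: (kernel_size, d_in, d_out) filter weights

    Every output position is independent of the others, so all steps
    can be computed in one parallel pass, with no recurrence.
    """
    k = w.shape[0]
    # Left-pad so the convolution never sees future positions.
    x_pad = np.concatenate([np.zeros((k - 1, x.shape[1])), x], axis=0)
    return np.stack([
        np.einsum('kd,kde->e', x_pad[t:t + k], w) for t in range(len(x))
    ])

x = np.random.randn(10, 4)    # toy sequence: 10 steps, 4 features
w = np.random.randn(3, 4, 8)  # kernel of width 3, 4 -> 8 channels
print(causal_conv1d(x, w).shape)  # (10, 8)
```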
We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task.
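The task format is simple: given a passage with its final word removed, predict that word. A minimal sketch of the resulting evaluation metric is below; `model_predict` is a hypothetical callable standing in for any model under test.

```python
def lambada_accuracy(model_predict, examples):
    """Fraction of passages whose final word the model predicts exactly.

    model_predict: hypothetical callable, context str -> predicted word str
    examples: list of (context, target_word) pairs
    """
    correct = sum(
        model_predict(context) == target for context, target in examples
    )
    return correct / len(examples)

# Toy usage with a made-up passage and a dummy predictor:
examples = [("He picked up the guitar and began to", "play")]
print(lambada_accuracy(lambda ctx: "play", examples))  # 1.0
```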
Attention is a commonly used mechanism in sequence processing, but its cost is quadratic, O(n^2), in the sequence length n, which prevents its application to long sequences.
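The quadratic cost is easy to see in standard scaled dot-product attention: the score matrix has one entry per pair of positions. A minimal NumPy sketch (ours, for illustration):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention over a length-n sequence.

    q, k, v: (n, d) arrays. The scores matrix below is (n, n), which is
    the O(n^2) time and memory cost that limits very long sequences.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])           # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # (n, d)

n, d = 512, 64
q = k = v = np.random.randn(n, d)
print(attention(q, k, v).shape)  # (512, 64); intermediate scores were 512 x 512
```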
To reduce the wall-clock training time, a common practice is to increase the batch size and learning rate.
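One widely cited heuristic for doing this is the linear scaling rule (popularized by Goyal et al., 2017): grow the learning rate in proportion to the batch size. A sketch, with names of our choosing:

```python
def scaled_lr(base_lr, base_batch, batch):
    """Linear scaling heuristic: learning rate grows with batch size.

    base_lr / base_batch come from a recipe tuned at a small batch;
    returns the learning rate to try at the larger batch.
    """
    return base_lr * batch / base_batch

# e.g. a recipe tuned at batch 256 with lr 0.1, scaled to batch 2048:
print(scaled_lr(0.1, 256, 2048))  # 0.8
```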