Memorization

269 papers with code • 1 benchmark • 4 datasets

Benchmarks

These leaderboards track progress in Memorization.

Dataset	Best Model
BIG-bench (Hindu Knowledge)	PaLM-540B (few-shot, k=5)

Libraries

Use these libraries to find Memorization models and implementations

faceonlive/ai-research

2 papers

152

smilelab-fl/fednoisy

2 papers

Most implemented papers

mixup: Beyond Empirical Risk Minimization

facebookresearch/mixup-cifar10 • ICLR 2018

We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.

Wide & Deep Learning for Recommender Systems

microsoft/recommenders • 24 Jun 2016

Memorization of feature interactions through a wide set of cross-product feature transformations is effective and interpretable, while generalization requires more feature engineering effort.
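
A cross-product feature transformation over binary features can be read as a logical AND: the crossed feature fires only when all of its constituent features are active. The sketch below illustrates that idea in plain Python. The function name `cross_product_features`, the `degree` parameter, and the feature strings are hypothetical and chosen for illustration; this is not code from microsoft/recommenders or from the paper.

```python
from itertools import combinations


def cross_product_features(active, degree=2):
    """Return the crossed features that fire for one example.

    `active` is the set of binary (one-hot) features equal to 1.
    A crossed feature is 1 only when every feature in it is 1,
    so each combination of active features yields one crossed feature.
    (Hypothetical helper, for illustration only.)
    """
    return {" AND ".join(combo) for combo in combinations(sorted(active), degree)}


# Hypothetical example with two active one-hot features.
example = {"gender=female", "language=en"}
crossed = cross_product_features(example)

# Three active features give three pairwise crosses.
three = cross_product_features({"a=1", "b=1", "c=1"})
```

A linear model with one weight per crossed feature can then memorize how often each specific co-occurrence led to a positive label, which is the sense of memorization the abstract refers to.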

Neural Machine Translation in Linear Time

paarthneekhara/byteNet-tensorflow • 31 Oct 2016

The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence.

Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels

bhanML/Co-teaching • NeurIPS 2018

Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training.

Generalization through Memorization: Nearest Neighbor Language Models

urvashik/knnlm • ICLR 2020

Applying this augmentation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15.79 - a 2.9 point improvement with no additional training.
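
The augmentation mixes the base LM's next-token distribution with a distribution built from retrieved nearest neighbors. A minimal sketch of that interpolation follows, assuming the neighbors have already been retrieved as (distance, next token) pairs. The function name `knn_lm_interpolate`, the dictionary representation, and the value of `lam` are illustrative assumptions and do not reproduce the urvashik/knnlm code.

```python
import math


def knn_lm_interpolate(p_lm, neighbors, lam=0.5):
    """Interpolate an LM distribution with a kNN distribution.

    p_lm:      dict mapping token -> probability under the base LM.
    neighbors: list of (distance, next_token) pairs for the retrieved
               nearest neighbors (retrieval itself is not shown).
    lam:       interpolation weight; 0.5 is an arbitrary illustrative value.
    """
    if not neighbors:
        # Nothing retrieved: fall back to the base LM unchanged.
        return dict(p_lm)

    # Closer neighbors get more weight: softmax over negative distances.
    weights = [math.exp(-dist) for dist, _ in neighbors]
    total = sum(weights)

    # Aggregate the weight of all neighbors that share a next token.
    p_knn = {}
    for weight, (_, token) in zip(weights, neighbors):
        p_knn[token] = p_knn.get(token, 0.0) + weight / total

    vocab = set(p_lm) | set(p_knn)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_lm.get(tok, 0.0)
        for tok in vocab
    }


# Hypothetical toy distribution and neighbors.
p_lm = {"Hawaii": 0.2, "Illinois": 0.8}
neighbors = [(0.0, "Hawaii"), (0.0, "Hawaii")]
mixed = knn_lm_interpolate(p_lm, neighbors, lam=0.5)
```

Because only the lookup and the mixing weight are added at inference time, the base LM's parameters stay fixed, consistent with the "no additional training" claim.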

Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

openai/grok • 6 Jan 2022

In this paper we propose to study generalization of neural networks on small algorithmically generated datasets.

PaLM: Scaling Language Modeling with Pathways

lucidrains/CoCa-pytorch • Google Research 2022

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).
