Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Continual Learning	20Newsgroup (10 tasks)	HAT	F1 - macro	0.9521	#2
Continual Learning	ASC (19 tasks)	HAT	F1 - macro	0.7816	#7
Continual Learning	DSC (10 tasks)	HAT	F1 - macro	0.8614	#3
Continual Learning	F-CelebA (10 tasks)	HAT	Acc	0.5673	#6

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/overcoming-catastrophic-forgetting-with-hard/continual-learning-on-20newsgroup-10-tasks)](https://paperswithcode.com/sota/continual-learning-on-20newsgroup-10-tasks?p=overcoming-catastrophic-forgetting-with-hard)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/overcoming-catastrophic-forgetting-with-hard/continual-learning-on-dsc-10-tasks)](https://paperswithcode.com/sota/continual-learning-on-dsc-10-tasks?p=overcoming-catastrophic-forgetting-with-hard)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/overcoming-catastrophic-forgetting-with-hard/continual-learning-on-f-celeba-10-tasks)](https://paperswithcode.com/sota/continual-learning-on-f-celeba-10-tasks?p=overcoming-catastrophic-forgetting-with-hard)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/overcoming-catastrophic-forgetting-with-hard/continual-learning-on-asc-19-tasks)](https://paperswithcode.com/sota/continual-learning-on-asc-19-tasks?p=overcoming-catastrophic-forgetting-with-hard)`

Overcoming catastrophic forgetting with hard attention to the task

ICML 2018 · Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou

Catastrophic forgetting occurs when a neural network loses the information learned in a previous task after training on subsequent tasks. This problem remains a hurdle for artificial intelligence systems with sequential learning capabilities. In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning. A hard attention mask is learned concurrently to every task, through stochastic gradient descent, and previous masks are exploited to condition such learning. We show that the proposed mechanism is effective for reducing catastrophic forgetting, cutting current rates by 45 to 80%. We also show that it is robust to different hyperparameter choices, and that it offers a number of monitoring capabilities. The approach features the possibility to control both the stability and compactness of the learned knowledge, which we believe makes it also attractive for online learning or network compression applications.
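The abstract describes the mechanism only in words, so the following is a minimal NumPy sketch of how a task-based hard attention mask could gate units and protect earlier tasks. It assumes a sigmoid gate over a per-task embedding scaled by a positive factor `s`, an element-wise maximum to accumulate masks across tasks, and a gradient scaling of `1 - min(...)` between adjacent layers. The function names (`task_mask`, `cumulative_mask`, `protect_gradient`) and the toy values are hypothetical and chosen for illustration; this is not the official `joansj/hat` code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_mask(embedding, s):
    """Near-binary gate for one layer: sigmoid(s * e_t).
    A large s (assumed here) pushes the gate toward 0 or 1."""
    return sigmoid(s * embedding)

def cumulative_mask(prev_cumulative, new_mask):
    """Accumulate masks over tasks with an element-wise maximum,
    so a unit claimed by any earlier task stays marked."""
    return np.maximum(prev_cumulative, new_mask)

def protect_gradient(grad, cum_out, cum_in):
    """Scale the gradient of a weight matrix of shape (n_out, n_in).
    Weight (i, j) is frozen when both its output unit i and its
    input unit j were claimed by previous tasks."""
    block = np.minimum(cum_out[:, None], cum_in[None, :])
    return grad * (1.0 - block)

# Toy example (all values are illustrative assumptions).
s = 50.0
e_task1 = np.array([1.0, -1.0, 1.0])   # learned embedding for task 1
a1 = task_mask(e_task1, s)             # units 0 and 2 are claimed
cum_out = cumulative_mask(np.zeros(3), a1)
cum_in = np.ones(2)                    # treat both inputs as claimed
grad = np.ones((3, 2))                 # raw gradient while training task 2
protected = protect_gradient(grad, cum_out, cum_in)
print(np.round(protected, 3))
```

In this sketch, the rows of `protected` that belong to units claimed by task 1 are scaled to zero, so training on a later task cannot overwrite them, while the unclaimed unit keeps its full gradient.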

Code

joansj/hat (official, 192 stars)

chilung/hat

Tasks

Continual Learning

Hard Attention

Datasets

CIFAR-10

CIFAR-100

SVHN

Fashion-MNIST

ASC (TIL, 19 tasks)

20Newsgroup (10 tasks)

DSC (10 tasks)

F-CelebA (10 tasks)

Results from the Paper

Ranked #2 on Continual Learning on 20Newsgroup (10 tasks)
