TAG: Task-based Accumulated Gradients for Lifelong learning

11 May 2021 · Pranshu Malviya, Balaraman Ravindran, Sarath Chandar

When an agent encounters a continual stream of new tasks in the lifelong learning setting, it leverages the knowledge gained from earlier tasks to learn the new tasks better. In such a scenario, identifying an efficient knowledge representation becomes a challenging problem. Most existing works propose to either store a subset of examples from past tasks in a replay buffer, dedicate a separate set of parameters to each task, or penalize excessive parameter updates through a regularization term. While these methods rely on the general, task-agnostic stochastic gradient descent update rule, we propose a task-aware optimizer that adapts the learning rate based on the relatedness among tasks. We capture the directions in which the parameters move during training by accumulating the gradients specific to each task. These task-based accumulated gradients act as a knowledge base that is maintained and updated throughout the stream. We empirically show that our proposed adaptive learning rate not only mitigates catastrophic forgetting but also enables positive backward transfer. We also show that our method outperforms several state-of-the-art methods in lifelong learning on complex datasets with a large number of tasks.
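For intuition only, the sketch below illustrates the idea in the abstract: keep one accumulated-gradient vector per task and adapt the step size according to how well the current task's update direction agrees with those of earlier tasks. The class name `TaskAwareSketchOptimizer`, the exponential-moving-average accumulator, and the cosine-similarity scaling are assumptions made for illustration; the paper's actual TAG-RMSProp update defines the relatedness measure and the adaptive learning rate precisely.

```python
# Illustrative sketch only -- NOT the paper's exact TAG-RMSProp update.
# Per-task accumulated gradients modulate a base learning rate.
import numpy as np


class TaskAwareSketchOptimizer:
    """Toy task-aware optimizer with per-task gradient accumulators."""

    def __init__(self, base_lr=0.01, beta=0.9, eps=1e-8):
        self.base_lr = base_lr
        self.beta = beta            # decay for per-task gradient accumulation
        self.eps = eps
        self.task_moments = {}      # task_id -> accumulated (EMA) gradient vector

    def _update_moment(self, task_id, grad):
        prev = self.task_moments.get(task_id, np.zeros_like(grad))
        self.task_moments[task_id] = self.beta * prev + (1.0 - self.beta) * grad

    def _relatedness(self, task_id):
        """Mean cosine similarity between the current task's accumulated
        gradient and those of earlier tasks; 0 if there is no history yet."""
        current = self.task_moments[task_id]
        others = [m for t, m in self.task_moments.items() if t != task_id]
        if not others:
            return 0.0
        sims = [
            float(np.dot(current, m)
                  / (np.linalg.norm(current) * np.linalg.norm(m) + self.eps))
            for m in others
        ]
        return float(np.mean(sims))

    def step(self, params, grad, task_id):
        """One update on a flat parameter vector `params` for task `task_id`."""
        self._update_moment(task_id, grad)
        # Shrink the step when the current direction conflicts with earlier
        # tasks (negative relatedness); allow larger steps when they agree.
        lr = self.base_lr * np.exp(self._relatedness(task_id))
        return params - lr * grad


# Minimal usage on a toy quadratic objective per task.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = rng.normal(size=4)
    opt = TaskAwareSketchOptimizer(base_lr=0.05)
    for task_id in range(3):                  # stream of tasks
        target = rng.normal(size=4)           # each task pulls params elsewhere
        for _ in range(20):
            grad = 2.0 * (params - target)    # gradient of ||params - target||^2
            params = opt.step(params, grad, task_id)
    print("final params:", params)
```

In this toy version the per-task accumulators double as the "knowledge base" from the abstract: they persist across the task stream and are the only quantity consulted when deciding how aggressively to update on the current task.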

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Continual Learning | 5-dataset (1 epoch) | TAG-RMSProp | Accuracy | 62.59 | #1 |
| Continual Learning | CIFAR-100 (20 tasks, 1 epoch) | TAG-RMSProp | Average Accuracy | 62.79 | #1 |
| Continual Learning | CUB-200-2011 (20 tasks, 1 epoch) | TAG-RMSProp | Accuracy | 61.58 | #1 |
| Continual Learning | mini-ImageNet (20 tasks, 1 epoch) | TAG-RMSProp | Accuracy | 57.2 | #1 |
