Designing Less Forgetful Networks for Continual Learning

Neural networks typically excel at learning a single task. Their plastic weights let them learn quickly, but the same weights are also unstable: when a network assimilates information to solve a new task, it may suffer catastrophic forgetting and lose the ability to solve past tasks. Existing methods have mostly attacked this problem through external constraints: replay shows the backbone network externally stored memories, regularisation imposes additional learning objectives, and dynamic architectures introduce extra parameters to host new knowledge. In contrast, we look for internal means of creating less forgetful networks. This paper demonstrates that two simple architectural modifications -- Masked Highway Connection and Layer-Wise Normalisation -- can drastically reduce forgetting in a backbone network. When naively trained sequentially over multiple tasks, our modified backbones were as competitive as unmodified backbones equipped with dedicated continual learning techniques. Furthermore, the proposed modifications were compatible with most, if not all, continual learning archetypes, and helped the respective techniques achieve new state-of-the-art results.
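
The abstract does not spell out how the two modifications are implemented. As a rough illustration only, the PyTorch sketch below shows one plausible reading: a highway-style skip connection whose transform gate is modulated by a per-unit mask, followed by layer normalisation. The class name `MaskedHighwayBlock`, the gating form, and the placement of the normalisation are assumptions for illustration, not the authors' design.

```python
import torch
import torch.nn as nn


class MaskedHighwayBlock(nn.Module):
    """Sketch of a block combining a masked highway connection with
    layer-wise normalisation (details assumed, not from the paper)."""

    def __init__(self, dim: int):
        super().__init__()
        self.transform = nn.Linear(dim, dim)   # candidate new features
        self.gate = nn.Linear(dim, dim)        # highway transform gate T(x)
        # Per-unit mask on the gate; here fixed to ones, but a continual
        # learning method could set it per task (assumption).
        self.register_buffer("mask", torch.ones(dim))
        self.norm = nn.LayerNorm(dim)          # per-layer normalisation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.transform(x))
        t = torch.sigmoid(self.gate(x)) * self.mask  # masked gate in [0, 1]
        # Highway combination: gated mix of new features and the identity path.
        y = t * h + (1.0 - t) * x
        return self.norm(y)


if __name__ == "__main__":
    block = MaskedHighwayBlock(dim=64)
    out = block(torch.randn(8, 64))
    print(out.shape)  # torch.Size([8, 64])
```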
