Learning without Forgetting: Task Aware Multitask Learning for Multi-Modality Tasks

Existing joint learning strategies like multi-task or meta-learning focus more on shared learning and have little to no scope for task-specific learning. This creates the need for a distinct shared pretraining phase and a task-specific finetuning phase. The fine-tuning phase creates separate systems for each task, where improving the performance of a particular task necessitates forgetting some of the knowledge garnered in other tasks. Humans, on the other hand, perform task-specific learning in synergy with general domain-based learning. Inspired by these learning patterns in humans, we suggest a simple yet generic task aware framework to incorporate into existing joint learning strategies. The proposed framework computes task-specific representations to modulate model parameters during joint learning. Hence, it performs both shared and task-specific learning in a single-phase resulting in a single model for all the tasks. The single model itself achieves significant performance gains over the existing joint learning strategies. For example, we train a model on Speech Translation (ST), Automatic Speech Recognition (ASR), and Machine Translation (MT) tasks using the proposed task aware joint learning approach. This single model achieves a performance of 28.64 BLEU score on ST MuST-C English-German, WER of 11.61 on ASR TEDLium v3, and BLEU score of 23.35 on MT WMT14 English-German tasks. This sets a new state-of-the-art performance (SOTA) on ST while having close to SOTA performance on ASR and competitive performance on MT tasks.

PDF Abstract


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.