MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models

9 Nov 2019 · Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong

Pretrained language models have led to significant performance gains in many NLP tasks. However, the intensive computing resources needed to train such models remain an issue...
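The approach named in the title builds on knowledge distillation, where a small student model is trained to match a large teacher's softened output distribution. As a rough illustration only (not the paper's multi-task formulation; all function names here are hypothetical), the standard soft-target objective can be sketched as:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T produces a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Cross-entropy between the teacher's softened distribution and the
    # student's, scaled by T^2 so gradients keep a comparable magnitude
    # across temperatures (the usual convention in distillation).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q)) * temperature ** 2
```

The loss is minimized when the student's distribution matches the teacher's; in a multi-task setting one would typically sum such losses over tasks, but the paper's exact objective is not shown in this excerpt.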



