LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions

27 Apr 2023 · Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji

Large language models (LLMs) with instruction fine-tuning demonstrate superior generative capabilities. However, these models are resource-intensive. To alleviate this issue, we explore distilling knowledge from instruction-tuned LLMs into much smaller ones. To this end, we carefully develop a large set of 2.58M instructions based on both existing and newly-generated instructions. In addition to being sizable, we design our instructions to cover a broad set of topics to ensure diversity. Extensive analysis of our instruction dataset confirms its diversity, and we generate responses for these instructions using gpt-3.5-turbo. Leveraging these instructions, we fine-tune a diverse herd of models, collectively referred to as LaMini-LM, which includes models from both the encoder-decoder and decoder-only families, with varying sizes. We evaluate the performance of our models using automatic metrics on 15 different natural language processing (NLP) benchmarks, as well as through human assessment. The results demonstrate that our proposed LaMini-LM models are comparable to competitive baselines, while being much smaller in size.

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Sentence Completion | HellaSwag | LaMini-GPT 1.5B | Accuracy | 48.3 | #70 |
| Sentence Completion | HellaSwag | LaMini-T5 738M | Accuracy | 40.6 | #76 |
| Sentence Completion | HellaSwag | FLAN-T5-Large 783M | Accuracy | 48.7 | #69 |
| Sentence Completion | HellaSwag | T5-Large 738M | Accuracy | 38.9 | #78 |
| Sentence Completion | HellaSwag | LaMini-F-T5 783M | Accuracy | 43.7 | #72 |
| Sentence Completion | HellaSwag | GPT-2-XL 1.5B | Accuracy | 50.9 | #66 |
| Natural Language Inference | MultiNLI | LaMini-F-T5 783M | Matched | 61.4 | #54 |
| Natural Language Inference | MultiNLI | LaMini-F-T5 783M | Mismatched | 61 | #44 |
| Natural Language Inference | MultiNLI | GPT-2-XL 1.5B | Matched | 36.5 | #56 |
| Natural Language Inference | MultiNLI | GPT-2-XL 1.5B | Mismatched | 37 | #46 |
| Natural Language Inference | MultiNLI | LaMini-GPT 1.5B | Matched | 67.5 | #53 |
| Natural Language Inference | MultiNLI | LaMini-GPT 1.5B | Mismatched | 69.3 | #42 |
| Natural Language Inference | MultiNLI | LaMini-T5 738M | Matched | 54.7 | #55 |
| Natural Language Inference | MultiNLI | LaMini-T5 738M | Mismatched | 55.8 | #45 |
| Natural Language Inference | MultiNLI | T5-Large 738M | Matched | 72.4 | #46 |
| Natural Language Inference | MultiNLI | T5-Large 738M | Mismatched | 72 | #38 |
| Question Answering | OpenBookQA | LaMini-T5 738M | Accuracy | 36 | #40 |
| Question Answering | OpenBookQA | LaMini-F-T5 783M | Accuracy | 34 | #41 |
| Question Answering | OpenBookQA | GPT-2-XL 1.5B | Accuracy | 32 | #43 |
| Question Answering | OpenBookQA | LaMini-GPT 1.5B | Accuracy | 39.8 | #39 |
| Question Answering | OpenBookQA | FLAN-T5-Large 783M | Accuracy | 31.2 | #44 |
| Question Answering | OpenBookQA | T5-Large 738M | Accuracy | 32.8 | #42 |
| Question Answering | PIQA | T5-Large 738M | Accuracy | 55.9 | #64 |
| Question Answering | PIQA | FLAN-T5-Large 783M | Accuracy | 72.2 | #52 |
| Question Answering | PIQA | LaMini-T5 738M | Accuracy | 67.2 | #59 |
| Question Answering | PIQA | LaMini-GPT 1.5B | Accuracy | 71.3 | #53 |
| Question Answering | PIQA | LaMini-F-T5 783M | Accuracy | 70.6 | #54 |
| Question Answering | PIQA | GPT-2-XL 1.5B | Accuracy | 70.5 | #55 |
| Natural Language Inference | RTE | GPT-2-XL 1.5B | Accuracy | 52.3% | #89 |
| Natural Language Inference | RTE | T5-Large 738M | Accuracy | 87.4% | #20 |
| Natural Language Inference | RTE | LaMini-GPT 1.5B | Accuracy | 67.9% | #61 |
| Natural Language Inference | RTE | LaMini-F-T5 783M | Accuracy | 65% | #65 |
| Natural Language Inference | RTE | LaMini-T5 738M | Accuracy | 57% | #81 |
| Coreference Resolution | Winograd Schema Challenge | LaMini-T5 738M | Accuracy | 59 | #61 |
| Coreference Resolution | Winograd Schema Challenge | LaMini-F-T5 783M | Accuracy | 64.1 | #44 |
| Coreference Resolution | Winograd Schema Challenge | LaMini-GPT 1.5B | Accuracy | 69.6 | #35 |
| Coreference Resolution | Winograd Schema Challenge | T5-Large 738M | Accuracy | 66.7 | #40 |
| Coreference Resolution | Winograd Schema Challenge | GPT-2-XL 1.5B | Accuracy | 73.3 | #29 |
| Common Sense Reasoning | WinoGrande | LaMini-GPT 1.5B | Accuracy | 56 | #60 |
| Common Sense Reasoning | WinoGrande | LaMini-F-T5 783M | Accuracy | 56 | #60 |
| Common Sense Reasoning | WinoGrande | LaMini-T5 738M | Accuracy | 54.9 | #65 |
| Common Sense Reasoning | WinoGrande | T5-Large 738M | Accuracy | 55.2 | #64 |
| Common Sense Reasoning | WinoGrande | FLAN-T5-Large 783M | Accuracy | 59.9 | #51 |
| Common Sense Reasoning | WinoGrande | GPT-2-XL 1.5B | Accuracy | 58.3 | #56 |
| Word Sense Disambiguation | Words in Context | LaMini-GPT 1.5B | Accuracy | 52.4 | #26 |
| Word Sense Disambiguation | Words in Context | LaMini-T5 738M | Accuracy | 50.5 | #32 |
| Word Sense Disambiguation | Words in Context | GPT-2-XL 1.5B | Accuracy | 49.8 | #34 |
| Word Sense Disambiguation | Words in Context | LaMini-F-T5 783M | Accuracy | 63.8 | #16 |
| Word Sense Disambiguation | Words in Context | FLAN-T5-Large 783M | Accuracy | 64.7 | #15 |
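
The abstract's claim that the LaMini-LM models are comparable to competitive baselines can be spot-checked against the rows above. The following is a minimal sketch, not part of the original page: it copies the HellaSwag and PIQA accuracies from the table and computes the difference between each LaMini model and a same-size baseline. The pairing in `BASE_OF` is an assumption made here by matching parameter counts (738M, 783M, 1.5B); the names `RESULTS`, `BASE_OF`, and `accuracy_deltas` are hypothetical helpers introduced for illustration.

```python
# Accuracy values copied from the HellaSwag and PIQA rows of the results table.
RESULTS = {
    "HellaSwag": {
        "LaMini-GPT 1.5B": 48.3,
        "LaMini-T5 738M": 40.6,
        "FLAN-T5-Large 783M": 48.7,
        "T5-Large 738M": 38.9,
        "LaMini-F-T5 783M": 43.7,
        "GPT-2-XL 1.5B": 50.9,
    },
    "PIQA": {
        "T5-Large 738M": 55.9,
        "FLAN-T5-Large 783M": 72.2,
        "LaMini-T5 738M": 67.2,
        "LaMini-GPT 1.5B": 71.3,
        "LaMini-F-T5 783M": 70.6,
        "GPT-2-XL 1.5B": 70.5,
    },
}

# ASSUMPTION: each LaMini model is paired with the baseline that has the
# same parameter count in the table; the page itself does not state this.
BASE_OF = {
    "LaMini-T5 738M": "T5-Large 738M",
    "LaMini-F-T5 783M": "FLAN-T5-Large 783M",
    "LaMini-GPT 1.5B": "GPT-2-XL 1.5B",
}


def accuracy_deltas(results, base_of):
    """Return {(dataset, lamini_model): lamini_accuracy - baseline_accuracy}."""
    deltas = {}
    for dataset, scores in results.items():
        for tuned, base in base_of.items():
            # Skip pairs where either model has no row for this dataset.
            if tuned in scores and base in scores:
                deltas[(dataset, tuned)] = round(scores[tuned] - scores[base], 1)
    return deltas


if __name__ == "__main__":
    for (dataset, model), delta in sorted(accuracy_deltas(RESULTS, BASE_OF).items()):
        print(f"{dataset:10s} {model:18s} {delta:+.1f}")
```

A positive value means the LaMini model scores higher than its paired baseline on that dataset. Other datasets from the table can be added to `RESULTS` in the same shape.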
