TextBox 2.0: A Text Generation Library with Pre-trained Language Models

To facilitate research on text generation, this paper presents TextBox 2.0, a comprehensive and unified library focused on the use of pre-trained language models (PLMs). To be comprehensive, the library covers 13 common text generation tasks and their corresponding 83 datasets, and incorporates 45 PLMs spanning general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight models. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), so that each step can be carried out in a consistent way. Despite its rich functionality, the library is easy to use, either through its Python API or from the command line. To validate the effectiveness of the library, we conduct extensive experiments and illustrate four types of research scenarios. The project is released at https://github.com/RUCAIBox/TextBox.

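| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Text Generation | ADGEN | BART (TextBox 2.0) | BLEU-4 | 10.2 | # 1 |
| Abstractive Text Summarization | CNN/Daily Mail | BART (TextBox 2.0) | ROUGE-1 | 44.47 | # 1 |
| Abstractive Text Summarization | CNN/Daily Mail | BART (TextBox 2.0) | ROUGE-2 | 21.5 | # 1 |
| Abstractive Text Summarization | CNN/Daily Mail | BART (TextBox 2.0) | ROUGE-L | 41.35 | # 1 |
| Text Generation | CommonGen | BART (TextBox 2.0) | CIDEr | 12.98 | # 2 |
| Text Generation | CommonGen | BART (TextBox 2.0) | BLEU-4 | 28.18 | # 1 |
| Text Generation | CommonGen | BART (TextBox 2.0) | SPICE | 33 | # 1 |
| Text Generation | CSL | BART (TextBox 2.0) | ROUGE-L | 64.34 | # 1 |
| Style Transfer | GYAFC | BART (TextBox 2.0) | BLEU-4 | 76.93 | # 1 |
| Style Transfer | GYAFC | BART (TextBox 2.0) | Accuracy | 94.37 | # 1 |
| Style Transfer | GYAFC | BART (TextBox 2.0) | Harmonic mean | 84.74 | # 1 |
| Text Generation | LCSTS | BART (TextBox 2.0) | ROUGE-L | 42.96 | # 1 |
| Task-Oriented Dialogue Systems | MULTIWOZ 2.0 | BART (TextBox 2.0) | BLEU-4 | 20.17 | # 1 |
| Task-Oriented Dialogue Systems | MULTIWOZ 2.0 | BART (TextBox 2.0) | Score | 100.07 | # 1 |
| Dialogue | Persona-Chat | BART (TextBox 2.0) | BLEU-1 | 49.581 | # 1 |
| Dialogue | Persona-Chat | BART (TextBox 2.0) | BLEU-2 | 39.24 | # 1 |
| Dialogue | Persona-Chat | BART (TextBox 2.0) | Distinct-1 | 1.44 | # 1 |
| Dialogue | Persona-Chat | BART (TextBox 2.0) | Distinct-2 | 8.89 | # 1 |
| Question Generation | SQuAD1.1 | BART (TextBox 2.0) | BLEU-4 | 25.08 | # 2 |
| Question Generation | SQuAD1.1 | BART (TextBox 2.0) | METEOR | 26.73 | # 1 |
| Question Generation | SQuAD1.1 | BART (TextBox 2.0) | ROUGE-L | 52.55 | # 3 |
| Question Answering | SQuAD1.1 | BART (TextBox 2.0) | F1 | 93.04 | # 18 |
| Question Answering | SQuAD1.1 | BART (TextBox 2.0) | Exact Match | 86.44 | # 1 |
| Data-to-Text Generation | WebNLG | BART (TextBox 2.0) | BLEU-4 | 67.33 | # 1 |
| Data-to-Text Generation | WebNLG | BART (TextBox 2.0) | METEOR | 47.78 | # 1 |
| Data-to-Text Generation | WebNLG | BART (TextBox 2.0) | ROUGE-L | 76.83 | # 1 |
| Text Simplification | Wiki-Auto + Turk | BART (TextBox 2.0) | BLEU-4 | 90.81 | # 1 |
| Text Simplification | Wiki-Auto + Turk | BART (TextBox 2.0) | METEOR | 57.58 | # 1 |
| Text Simplification | Wiki-Auto + Turk | BART (TextBox 2.0) | ROUGE-2 | 83.36 | # 1 |
| Machine Translation | WMT2016 English-Romanian | BART (TextBox 2.0) | BLEU-4 | 37.2 | # 1 |
| Machine Translation | WMT2016 Romanian-English | BART (TextBox 2.0) | BLEU-4 | 37.48 | # 1 |
| Story Generation | WritingPrompts | BART (TextBox 2.0) | BLEU-1 | 33.79 | # 1 |
| Story Generation | WritingPrompts | BART (TextBox 2.0) | BLEU-2 | 15.78 | # 1 |
| Story Generation | WritingPrompts | BART (TextBox 2.0) | Distinct-4 | 78.762 | # 1 |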
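
The Distinct-n rows above report a diversity metric that is commonly defined as the number of unique n-grams divided by the total number of n-grams in the generated text. The sketch below illustrates that common corpus-level definition only. It is not taken from TextBox 2.0, and the library's own tokenization, aggregation, and scaling may differ (the reported values look like percentages, which is an assumption here). The function name `distinct_n` and the sample sentences are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def distinct_n(sentences: Iterable[List[str]], n: int) -> float:
    """Return (unique n-grams) / (total n-grams) over all tokenized sentences.

    This is the common corpus-level definition; it is an illustrative
    assumption, not the TextBox 2.0 implementation.
    """
    counts = Counter()
    for tokens in sentences:
        # Slide a window of width n over each sentence separately,
        # so n-grams never span a sentence boundary.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    total = sum(counts.values())
    # Guard against empty input or sentences shorter than n.
    return len(counts) / total if total else 0.0


# Hypothetical generated responses, whitespace-tokenized for illustration.
outputs = [s.split() for s in ["i like music", "i like dogs", "i do not know"]]
d1 = distinct_n(outputs, 1)  # 7 unique unigrams out of 10
d2 = distinct_n(outputs, 2)  # 6 unique bigrams out of 7
print(f"Distinct-1: {100 * d1:.2f}  Distinct-2: {100 * d2:.2f}")
```

Higher values indicate less repetition across outputs. Multiplying by 100 in the final line only mirrors the apparent scale of the table.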

Methods