Recently, instruction tuning (IT), which fine-tunes a pre-trained language model on a massive collection of tasks described via human-craft instructions, has been shown effective in instruction learning for unseen tasks.
Despite the success of text-to-text pre-trained models in various natural language generation (NLG) tasks, the generation performance is largely restricted by the number of labeled data in downstream tasks, particularly in data-to-text generation tasks.
Existing reference-free metrics have obvious limitations for evaluating controlled text generation models.
Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems.
We provide both empirical and theoretical evidence to show that our method effectively removes the biases existing in the original distinct score.
2 code implementations • 3 Aug 2021 • Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang
Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones.
2 code implementations • 20 Jun 2021 • Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan YAO, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
Existing pre-trained models for knowledge-graph-to-text (KG-to-text) generation simply fine-tune text-to-text pre-trained models such as BART or T5 on KG-to-text datasets, which largely ignore the graph structure during encoding and lack elaborate pre-training tasks to explicitly model graph-text alignments.
Ranked #1 on KG-to-Text Generation on WebQuestions
This paper propose to combine pretrained language models with the modular dialogue paradigm for open-domain dialogue modeling.
Despite the great success of Generative Adversarial Networks (GANs) in generating high-quality images, GANs for text generation still face two major challenges: first, most text GANs are unstable in training mainly due to ineffective optimization of the generator, and they heavily rely on maximum likelihood pretraining; second, most text GANs adopt autoregressive generators without latent variables, which largely limits the ability to learn latent representations for natural language text.
3 code implementations • 1 Dec 2020 • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun
However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available.
Despite the success of generative pre-trained language models on a series of text generation tasks, they still suffer in cases where reasoning over underlying commonsense knowledge is required during generation.
Commonsense explanation generation aims to empower the machine's sense-making capability by generating plausible explanations to statements against commonsense.
The cleaned dataset and the pre-training models will facilitate the research of short-text conversation modeling.
In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions.
To benefit the downstream tasks in sentiment analysis, we propose a novel language representation model called SentiLARE, which introduces word-level linguistic knowledge including part-of-speech tag and sentiment polarity (inferred from SentiWordNet) into pre-trained models.
The generator is optimized with maximum likelihood estimation augmented by the discriminator's rewards instead of policy gradient.
Experiments show that our model outperforms state-of-the-art baselines, and it has the ability to generate responses with both controlled sentence function and informative content.