Lingxi: A Diversity-aware Chinese Modern Poetry Generation System

27 Aug 2021  ·  Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li ·

Poetry generation has been a difficult task in natural language processing. Unlike plain neural text generation tasks, poetry has a high requirement for novelty, since an easily-understood sentence with too many high frequency words might not be considered as poetic, while adequately ambiguous sentences with low frequency words can possibly be novel and creative. Inspired by this, we present Lingxi, a diversity-aware Chinese modern poetry generation system. We propose nucleus sampling with randomized head (NS-RH) algorithm, which randomizes the high frequency part ("head") of the predicted distribution, in order to emphasize on the "comparatively low frequency" words. The proposed algorithm can significantly increase the novelty of generated poetry compared with traditional sampling methods. The permutation of distribution is controllable by tuning the filtering parameter that determines the "head" to permutate, achieving diversity-aware sampling. We find that even when a large portion of filtered vocabulary is randomized, it can actually generate fluent poetry but with notably higher novelty. We also propose a semantic-similarity-based rejection sampling algorithm, which creates longer and more informative context on the basis of the short input poetry title while maintaining high semantic similarity to the title, alleviating the off-topic issue.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here