Semantic Noise Matters for Neural Natural Language Generation

Neural natural language generation (NNLG) systems are known for their pathological outputs, i.e., generating text that is unrelated to the input specification. In this paper, we show the impact of semantic noise on state-of-the-art NNLG models that implement different semantic control mechanisms. We find that cleaned data can improve semantic correctness by up to 97%, while maintaining fluency. We also find that the most common error is omitting information, rather than hallucination.
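
To make the omission/hallucination distinction concrete, here is a minimal sketch (not the authors' tooling; the slot names, example MR, and naive string-matching heuristic are illustrative assumptions) of counting slot values from an E2E-style meaning representation that are missing from a generated text, versus values that appear in the text without being licensed by the MR.

```python
# Illustrative sketch only: detect omitted vs. hallucinated slot values by
# simple substring matching between an E2E-style MR and a generated text.
import re

def parse_mr(mr: str) -> dict:
    """Parse an E2E-style MR such as 'name[The Eagle], eatType[coffee shop]'."""
    return dict(re.findall(r"(\w+)\[(.*?)\]", mr))

def slot_errors(mr: str, text: str, known_values: dict) -> dict:
    """Return slot values that are omitted (in the MR but absent from the text)
    and hallucinated (in the text but not licensed by the MR)."""
    slots = parse_mr(mr)
    text_lower = text.lower()
    omitted = {s: v for s, v in slots.items() if v.lower() not in text_lower}
    hallucinated = {
        s: v
        for s, values in known_values.items()
        for v in values
        if v.lower() in text_lower and slots.get(s, "").lower() != v.lower()
    }
    return {"omitted": omitted, "hallucinated": hallucinated}

# Hypothetical example MR and output, just to exercise the functions.
mr = "name[The Eagle], eatType[coffee shop], food[French], area[riverside]"
generated = "The Eagle is a coffee shop by the riverside."
known_values = {"food": ["French", "Italian"], "eatType": ["coffee shop", "pub"]}
print(slot_errors(mr, generated, known_values))
# -> food[French] is omitted; nothing is hallucinated
```

A real slot aligner would need value paraphrases and delexicalisation rather than raw substring matching, but the same bookkeeping underlies the omission/hallucination counts discussed in the abstract.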

Published at WS 2019.

Datasets

Cleaned E2E NLG Challenge

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Data-to-Text Generation | Cleaned E2E NLG Challenge | TGen | BLEU (Test set) | 40.73 | #3 |
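
Corpus-level BLEU figures like the one above are usually computed with a standard toolkit. Below is a minimal sketch using sacrebleu with made-up sentences; the official E2E evaluation uses its own e2e-metrics scripts, so this only approximates that setup and will not reproduce the exact score.

```python
# Minimal corpus-BLEU sketch with sacrebleu (assumes `pip install sacrebleu`).
import sacrebleu

hypotheses = ["The Eagle is a coffee shop by the riverside."]   # system outputs
references = [["The Eagle is a riverside coffee shop."]]        # one reference stream, aligned with hypotheses
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```

E2E provides multiple human references per meaning representation; with sacrebleu these are passed as additional reference streams (extra inner lists) of the same length as the hypothesis list.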

Methods


No methods listed for this paper.