Handling Rare Items in Data-to-Text Generation

WS 2018 · Anastasia Shimorina, Claire Gardent

Neural approaches to data-to-text generation generally handle rare input items using either delexicalisation or a copy mechanism. We investigate the relative impact of these two methods on two datasets (E2E and WebNLG) under two evaluation settings. We show (i) that rare items strongly impact performance; (ii) that combining delexicalisation and copying yields the strongest improvement; (iii) that copying underperforms for rare and unseen items; and (iv) that the impact of these two mechanisms varies greatly depending on how the dataset is constructed and on how it is split into train, dev and test.
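To make the first of the two methods concrete, here is a minimal sketch of delexicalisation on an E2E-style input. It is an illustration under stated assumptions, not the authors' implementation: the function names `delexicalise` and `relexicalise`, the `__SLOT__` placeholder format, and the exact-string matching are all hypothetical choices. The idea is that slot values, which may be rare or unseen, are replaced by placeholders before training, and the placeholders in the generated text are filled back in afterwards.

```python
def delexicalise(slots, text):
    """Replace slot values with placeholders in the input and the reference text.

    Assumption: values appear verbatim in the text (exact string match).
    """
    mapping = {}
    delex_text = text
    # Replace longer values first so a short value cannot clobber part of a longer one.
    for name, value in sorted(slots.items(), key=lambda kv: -len(kv[1])):
        placeholder = f"__{name.upper()}__"  # hypothetical placeholder format
        mapping[placeholder] = value
        delex_text = delex_text.replace(value, placeholder)
    # The model input keeps the slot names but sees placeholders instead of values.
    delex_input = [(name, f"__{name.upper()}__") for name in slots]
    return delex_input, delex_text, mapping


def relexicalise(text, mapping):
    """Fill the placeholders in a generated text back in with the original values."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


slots = {"name": "Zizzi", "area": "riverside"}
text = "Zizzi is located in the riverside area."

delex_input, delex_text, mapping = delexicalise(slots, text)
print(delex_text)                         # __NAME__ is located in the __AREA__ area.
print(relexicalise(delex_text, mapping))  # Zizzi is located in the riverside area.
```

A copy mechanism takes the other route: the values stay in the input and the decoder learns to copy them into the output, so no placeholder mapping is needed.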

Datasets


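Task                   Dataset                     Model                                           Metric  Value  Global Rank
KG-to-Text Generation  WebNLG 2.0 (Constrained)    SOTA-NPT                                        BLEU    48.0   #5
KG-to-Text Generation  WebNLG 2.0 (Constrained)    SOTA-NPT                                        METEOR  36.0   #5
KG-to-Text Generation  WebNLG 2.0 (Constrained)    SOTA-NPT                                        ROUGE   65.0   #5
KG-to-Text Generation  WebNLG 2.0 (Unconstrained)  Handling Rare Items in Data-to-Text Generation  BLEU    61     #11
KG-to-Text Generation  WebNLG 2.0 (Unconstrained)  Handling Rare Items in Data-to-Text Generation  METEOR  42     #11
KG-to-Text Generation  WebNLG 2.0 (Unconstrained)  Handling Rare Items in Data-to-Text Generation  ROUGE   71.0   #11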