Butterfly Effects in Frame Semantic Parsing: impact of data processing on model ranking

COLING 2018 · Alexandre Kabbach, Corentin Ribeyre, Aurélie Herbelot

Knowing the state of the art for a particular task is an essential component of any computational linguistics investigation. But can we be truly confident that the current state of the art is indeed the best-performing model? In this paper, we study the case of frame semantic parsing, a well-established task with multiple shared datasets. We show that in spite of all the care taken to provide a standard evaluation resource, small variations in data processing can have dramatic consequences for ranking parser performance. This leads us to propose an open-source standardized processing pipeline, which can be shared and reused for robust model comparison.