EACL 2017 • Yvette Graham, Qingsong Ma, Timothy Baldwin, Qun Liu, Carla Parra, Carolina Scarton
Meaningful conclusions about the relative performance of NLP systems are only possible if the gold standard employed in a given evaluation is both valid and reliable.