Human vs Automatic Metrics: on the Importance of Correlation Design

29 May 2018Anastasia Shimorina

This paper discusses two existing approaches to the correlation analysis between automatic evaluation metrics and human scores in the area of natural language generation. Our experiments show that depending on the usage of a system- or sentence-level correlation analysis, correlation results between automatic scores and human judgments are inconsistent...

PDF Abstract


No code implementations yet. Submit your code now

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.