Dialogue Evaluation

48 papers with code • 2 benchmarks • 6 datasets

SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation

royny/self-eval COLING 2022

This paper introduces a novel Self-supervised Fine-grained Dialogue Evaluation framework (SelF-Eval).

17 Aug 2022

Findings of the RuATD Shared Task 2022 on Artificial Text Detection in Russian

dialogue-evaluation/ruatd 3 Jun 2022

The first task is framed as a binary classification problem.

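The first RuATD task, then, is a standard binary text classification problem (human-written vs. machine-generated). Below is a minimal sketch of such a baseline using scikit-learn; the toy texts, labels, and character n-gram features are illustrative assumptions, not the shared task's actual data or models:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: 0 = human-written, 1 = machine-generated (labels invented here).
texts = [
    "The quick brown fox jumps over the lazy dog.",
    "The the generation of of text is is a task task.",
    "She walked to the market and bought fresh bread.",
    "Text text produced by by a model repeats tokens tokens.",
]
labels = [0, 1, 0, 1]

# Character n-grams are a common choice for detection baselines, since
# generation artifacts often surface at the subword level.
detector = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(),
)
detector.fit(texts, labels)
print(detector.predict(["Bread bread is is fresh fresh fresh."]))
```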

InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning

prakharguptaz/Instructdial 25 May 2022

We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets.

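Below is a hypothetical sketch of the kind of unified text-to-text conversion the InstructDial snippet describes; the field names, instruction wording, and [TURN] separator are assumptions for illustration, not the repository's actual schema:

```python
def to_text_to_text(instruction: str, context: list[str], target: str) -> dict:
    """Flatten a dialogue example into a single (input, output) text pair.

    The [TURN] separator and field layout are illustrative assumptions,
    not InstructDial's actual format.
    """
    flat_context = " [TURN] ".join(context)
    return {
        "input": f"Instruction: {instruction}\nContext: {flat_context}",
        "output": target,
    }

example = to_text_to_text(
    instruction="Given the dialogue context, generate the next response.",
    context=["Hi, how are you?", "Great, just got back from a hike."],
    target="That sounds fun! Where did you go?",
)
print(example["input"])
print(example["output"])
```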

RuNNE-2022 Shared Task: Recognizing Nested Named Entities

dialogue-evaluation/runne 23 May 2022

In the test set, all entity types occur with equal frequency.


What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation

alexa/conture Findings (ACL) 2022

Existing model-based metrics for system response evaluation are trained on human-annotated data, which is cumbersome to collect.

25 Mar 2022

DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations

pluslabnlp/deam ACL 2022

We also show that DEAM can distinguish between coherent and incoherent dialogues generated by baseline manipulations, whereas those baseline models cannot detect incoherent examples generated by DEAM.

18 Mar 2022
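
For context, the "baseline manipulations" mentioned above typically create incoherent negatives by perturbing utterance order. Below is a minimal sketch of one such manipulation (utterance shuffling); this is deliberately not DEAM's AMR-based pipeline:

```python
import random

def shuffle_negative(dialogue: list[str], seed: int = 0) -> list[str]:
    """Baseline manipulation: permute utterance order to break coherence."""
    rng = random.Random(seed)
    negative = dialogue[:]
    rng.shuffle(negative)
    return negative

dialogue = [
    "Do you like hiking?",
    "Yes, I go every weekend.",
    "Nice, which trail is your favorite?",
    "The ridge loop near the lake.",
]
# (coherent, incoherent) pairs like this can train a binary coherence scorer.
print(shuffle_negative(dialogue))
```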

Achieving Reliable Human Assessment of Open-Domain Dialogue Systems

tianboji/dialogue-eval ACL 2022

Answering the distress call of competitions that have emphasized the urgent need for better evaluation techniques in dialogue, we present a human evaluation method that is highly reliable while remaining feasible and low-cost.

11 Mar 2022

MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation

e0397123/mdd-eval 14 Dec 2021

Chatbots are designed to carry out human-like conversations across different domains, such as general chit-chat, knowledge exchange, and persona-grounded conversations.


Automatic Evaluation and Moderation of Open-domain Dialogue Systems

e0397123/dstc10_metric_track 3 Nov 2021

The development of Open-Domain Dialogue Systems (ODS) is a trending topic due to the large number of research challenges, the broad societal and business impact, and advances in the underlying technology.


A Human-machine Collaborative Framework for Evaluating Malevolence in Dialogues

repozhang/case_hmceval ACL 2021

HMCEval casts dialogue evaluation as a sample assignment problem, in which each sample must be assigned to either a human or a machine for evaluation.

01 Aug 2021
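
A rough sketch of how a sample assignment problem like this can be framed: route the samples the machine evaluator is least confident about to humans, subject to a human-annotation budget. The confidence-ranking rule, threshold, and budget below are illustrative assumptions, not HMCEval's actual assignment method:

```python
def assign_samples(machine_confidences: list[float], human_budget: int,
                   threshold: float = 0.9) -> list[str]:
    """Assign each sample to 'human' or 'machine' evaluation.

    Samples the machine evaluator is least confident about are sent to
    humans first, until the human budget is exhausted.
    """
    order = sorted(range(len(machine_confidences)),
                   key=lambda i: machine_confidences[i])
    assignments = ["machine"] * len(machine_confidences)
    for i in order[:human_budget]:
        if machine_confidences[i] < threshold:
            assignments[i] = "human"
    return assignments

# Four samples with machine-evaluator confidences; budget for two human labels.
print(assign_samples([0.95, 0.55, 0.80, 0.99], human_budget=2))
# -> ['machine', 'human', 'human', 'machine']
```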