Open-Domain Dialog
32 papers with code • 1 benchmark • 13 datasets
Most implemented papers
KILT: a Benchmark for Knowledge Intensive Language Tasks
We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance.
ProphetNet-X: Large-Scale Pre-training Models for English, Chinese, Multi-lingual, Dialog, and Code Generation
ProphetNet is a pre-training-based natural language generation method that shows strong performance on English text summarization and question generation tasks.
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems
To compare this novel metric and interactive evaluation against state-of-the-art metrics and human evaluation of static conversations, we run extended experiments with a set of models, including several that improve on recent hierarchical dialog generation architectures through utterance-level sentiment and semantic knowledge distillation.
Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References
The aim of this paper is to mitigate the shortcomings of automatic evaluation of open-domain dialog systems through multi-reference evaluation.
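The core idea of multi-reference evaluation can be sketched in a few lines: score a response against every human-written reference and keep the best match, so that any one valid answer suffices. This is a minimal illustration using a simple token-overlap F1 as the scoring function, not the paper's exact metric.

```python
from collections import Counter

def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a candidate response and one reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def multi_reference_score(candidate: str, references: list[str]) -> float:
    """Take the max over references: a response only needs to
    match one of the diverse valid answers to score well."""
    return max(token_f1(candidate, r) for r in references)

refs = ["i love hiking on weekends", "mostly i go hiking"]
print(multi_reference_score("i go hiking on weekends", refs))  # 0.8
```

With a single reference the first response would be penalized for matching the "wrong" valid answer; the max over references removes that penalty.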
Unsupervised Evaluation of Interactive Dialog with DialoGPT
It is important to define meaningful and interpretable automatic evaluation metrics for open-domain dialog research.
Dialogue Response Ranking Training with Large-Scale Human Feedback Data
In particular, our ranker outperforms the conventional dialog perplexity baseline by a large margin when predicting Reddit feedback.
Hurdles to Progress in Long-form Question Answering
The task of long-form question answering (LFQA) involves retrieving documents relevant to a given question and using them to generate a paragraph-length answer.
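The retrieve-then-generate structure of LFQA can be sketched with a toy retriever; the term-overlap ranking and the example documents below are illustrative stand-ins, not the retrievers used in the paper, and the generation step is left as a stub.

```python
def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many question terms they contain
    (a toy stand-in for a real sparse or dense retriever)."""
    q_terms = set(question.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

docs = [
    "rayleigh scattering makes the sky appear blue",
    "bread rises because of yeast",
]
top = retrieve("why is the sky blue", docs, k=1)
print(top)  # ['rayleigh scattering makes the sky appear blue']

# In a full LFQA system the retrieved passages would be fed,
# together with the question, into a generator that produces
# a paragraph-length answer.
```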
RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems
Open-domain human-computer conversation has been attracting increasing attention over the past few years.
Augmenting Neural Response Generation with Context-Aware Topical Attention
Our model builds on the basic Seq2Seq architecture, augmenting it with a hierarchical joint attention mechanism that incorporates topical concepts and previous interactions into response generation.
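The joint-attention idea can be sketched numerically: at each decoding step, attend separately over context states and topic embeddings, then combine both summaries with the decoder state. This is a minimal NumPy sketch with random, untrained vectors and hypothetical dimensions, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes: hidden dim 8, 5 context states, 3 topic concepts.
H = 8
ctx_states = rng.normal(size=(5, H))  # encoder states of previous turns
topic_embs = rng.normal(size=(3, H))  # embeddings of topical concepts
dec_state = rng.normal(size=H)        # current decoder hidden state

def attend(query: np.ndarray, keys: np.ndarray) -> np.ndarray:
    """Dot-product attention: weighted sum of keys given a query."""
    weights = softmax(keys @ query)
    return weights @ keys

# Joint attention: one summary of the dialog context, one of the
# topical concepts; both are concatenated with the decoder state
# to form the input for predicting the next token.
ctx_vec = attend(dec_state, ctx_states)
topic_vec = attend(dec_state, topic_embs)
joint = np.concatenate([dec_state, ctx_vec, topic_vec])
print(joint.shape)  # (24,)
```

Keeping the two attention summaries separate lets the decoder weight topical relevance and conversational context independently at every step.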
Evaluating Coherence in Dialogue Systems using Entailment
Evaluating open-domain dialogue systems is difficult due to the diversity of possible correct answers.