DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue

28 Sep 2020  ·  Shikib Mehri, Mihail Eric, Dilek Hakkani-Tur ·

A long-standing goal of task-oriented dialogue research is the ability to flexibly adapt dialogue models to new domains. To advance research in this direction, we introduce DialoGLUE (Dialogue Language Understanding Evaluation), a public benchmark consisting of 7 task-oriented dialogue datasets covering 4 distinct natural language understanding tasks, designed to encourage dialogue research in representation-based transfer, domain adaptation, and sample-efficient task learning. We release several strong baseline models, demonstrating that pre-training on a large open-domain dialogue corpus combined with task-adaptive self-supervised training yields performance improvements over a vanilla BERT architecture and state-of-the-art results on 5 of the 7 tasks. Through the DialoGLUE benchmark, the baseline methods, and our evaluation scripts, we hope to facilitate progress towards the goal of developing more general task-oriented dialogue models.
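The task-adaptive self-supervised training mentioned above is, in BERT-style models, typically continued masked-language-model (MLM) training on in-domain dialogue text. As an illustration only (the abstract does not spell out the exact recipe), here is a minimal sketch of the standard BERT masking scheme used to create MLM training examples: roughly 15% of tokens are selected as prediction targets, of which 80% are replaced by [MASK], 10% by a random vocabulary token, and 10% left unchanged. The token list and vocabulary below are hypothetical.

```python
import random

MASK = "[MASK]"
# Hypothetical mini-vocabulary used for the 10% random-replacement case.
VOCAB = ["hello", "book", "hotel", "flight", "table"]

def mlm_mask(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking: select ~mask_prob of positions as prediction
    targets; of those, 80% become [MASK], 10% a random token, 10% stay."""
    rng = random.Random(seed)
    inputs, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK
            elif r < 0.9:
                inputs[i] = rng.choice(VOCAB)
            # else: keep the original token unchanged (but still predict it)
    return inputs, labels
```

During task-adaptive training, the corrupted `inputs` are fed to the encoder and the loss is computed only at positions where `labels` is set; fine-tuning on the downstream DialoGLUE task follows afterwards.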


Datasets

Introduced in the paper: DialoGLUE

Used in the paper: GLUE, SuperGLUE, MultiWOZ, CLINC150, HWU64

Results from the Paper

Task: Multi-domain Dialogue State Tracking
Dataset: MultiWOZ 2.1
Model: ConvBERT-DG + Multi
Metric: Joint Accuracy = 58.7 (global rank #5)
