Spoken Dialogue Systems
19 papers with code • 0 benchmarks • 2 datasets
Benchmarks
These leaderboards are used to track progress in Spoken Dialogue Systems.
Libraries
Use these libraries to find Spoken Dialogue Systems models and implementations.
Most implemented papers
Variational Cross-domain Natural Language Generation for Spoken Dialogue Systems
Cross-domain natural language generation (NLG) is still a difficult task within spoken dialogue modelling.
A dataset for resolving referring expressions in spoken dialogue via contextual query rewrites (CQR)
In this paper, we describe our methodology for creating the query reformulation extension to the dialog corpus, and present an initial set of experiments to establish a baseline for the CQR task.
Hierarchical Multi-Task Natural Language Understanding for Cross-domain Conversational AI: HERMIT NLU
We present a new neural architecture for wide-coverage Natural Language Understanding in Spoken Dialogue Systems.
Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks
Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue.
"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations
Most prior work in dialogue modeling has focused on written conversations, largely because of the data sets available.
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification
Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services.
When can I Speak? Predicting initiation points for spoken dialogue agents
Current spoken dialogue systems initiate their turns after a long period of silence (700–1000 ms), which leads to little real-time feedback, sluggish responses, and an overall stilted conversational flow.
OLISIA: a Cascade System for Spoken Dialogue State Tracking
Though Dialogue State Tracking (DST) is a core component of spoken dialogue systems, recent work on this task mostly deals with chat corpora, disregarding the discrepancies between spoken and written language. In this paper, we propose OLISIA, a cascade system which integrates an Automatic Speech Recognition (ASR) model and a DST model.
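The cascade structure described here can be sketched in a few lines: an ASR stage produces a transcript, which a DST stage consumes to update dialogue state. The sketch below uses toy stand-ins (a fixed transcript and a keyword matcher) for illustration only; the function names and slot schema are assumptions, not OLISIA's actual models or interfaces.

```python
def asr_transcribe(audio):
    """Toy ASR stage: a real cascade would run a speech recognizer here."""
    # Stand-in output simulating a recognized user utterance.
    return "book a table for two at an italian restaurant"


def track_state(transcript, state):
    """Toy DST stage: fill slot values from keywords in the transcript."""
    # Hypothetical keyword-to-slot mapping for illustration.
    slots = {"italian": ("food", "italian"), "two": ("party_size", "2")}
    for keyword, (slot, value) in slots.items():
        if keyword in transcript:
            state[slot] = value
    return state


def cascade(audio, state=None):
    """Chain ASR output into DST, the defining shape of a cascade system."""
    transcript = asr_transcribe(audio)
    return track_state(transcript, state or {})


print(cascade(b""))  # {'food': 'italian', 'party_size': '2'}
```

The point of the cascade design is this separation of concerns: the DST model only ever sees text, so ASR errors propagate into tracking, which is why the paper studies adaptations to reduce that gap.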
Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model
This paper explores the potential of constructing an AI spoken dialogue system that "thinks how to respond" and "thinks how to speak" simultaneously, which more closely aligns with the human speech production process compared to the current cascade pipeline of independent chatbot and Text-to-Speech (TTS) modules.