no code implementations • 16 Nov 2023 • Evgeniia Razumovskaia, Ivan Vulić, Pavle Marković, Tomasz Cichy, Qian Zheng, Tsung-Hsien Wen, Paweł Budzianowski
Factuality is a crucial requirement in information seeking dialogue: the system should respond to the user's queries so that the responses are meaningful and aligned with the knowledge provided to the system.
no code implementations • EMNLP 2021 • Ivan Vulić, Pei-Hao Su, Sam Coope, Daniela Gerz, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Tsung-Hsien Wen
Transformer-based language models (LMs) pretrained on large text collections are proven to store a wealth of semantic knowledge.
no code implementations • EMNLP 2021 • Daniela Gerz, Pei-Hao Su, Razvan Kusztos, Avishek Mondal, Michał Lis, Eshan Singhal, Nikola Mrkšić, Tsung-Hsien Wen, Ivan Vulić
We present a systematic study on multilingual and cross-lingual intent detection from spoken data.
5 code implementations • Findings of the Association for Computational Linguistics 2020 • Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien Wen, Ivan Vulić
General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train.
Ranked #1 on Conversational Response Selection on PolyAI Reddit
no code implementations • IJCNLP 2019 • Matthew Henderson, Ivan Vulić, Iñigo Casanueva, Paweł Budzianowski, Daniela Gerz, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su
We present PolyResponse, a conversational search engine that supports task-oriented dialogue.
1 code implementation • ACL 2019 • Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su
Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks.
3 code implementations • WS 2019 • Matthew Henderson, Paweł Budzianowski, Iñigo Casanueva, Sam Coope, Daniela Gerz, Girish Kumar, Nikola Mrkšić, Georgios Spithourakis, Pei-Hao Su, Ivan Vulić, Tsung-Hsien Wen
Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches.
1 code implementation • EMNLP 2018 • Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, Iñigo Casanueva, Stefan Ultes, Osman Ramadan, Milica Gašić
Even though machine learning has become the major scene in dialogue research community, the real breakthrough has been blocked by the scale of data available. To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning over multiple domains and topics. At a size of 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora. The contribution of this work apart from the open-sourced dataset is two-fold: firstly, a detailed description of the data collection procedure along with a summary of data structure and analysis is provided.
5 code implementations • EMNLP 2018 • Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, Iñigo Casanueva, Stefan Ultes, Osman Ramadan, Milica Gašić
Even though machine learning has become the major scene in dialogue research community, the real breakthrough has been blocked by the scale of data available.
no code implementations • ICLR 2018 • Tsung-Hsien Wen, Minh-Thang Luong
In this paper, we propose the Latent Topic Conversational Model (LTCM), which augments seq2seq with a neural latent topic component to better guide response generation and make training easier.
no code implementations • 29 Nov 2017 • Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić
Dialogue assistants are rapidly becoming an indispensable daily aid.
no code implementations • WS 2017 • Stefan Ultes, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Lina Rojas-Barahona, Pei-Hao Su, Tsung-Hsien Wen, Milica Gašić, Steve Young
Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length.
no code implementations • WS 2017 • Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić
In doing that, we show that our approach has the potential to facilitate policy optimisation for more sophisticated multi-domain dialogue systems.
1 code implementation • ICML 2017 • Tsung-Hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
Developing a dialogue agent that is capable of making autonomous decisions and communicating by natural language is one of the long-term goals of machine learning research.
no code implementations • COLING 2016 • Lina M. Rojas Barahona, Milica Gasic, Nikola Mrkšić, Pei-Hao Su, Stefan Ultes, Tsung-Hsien Wen, Steve Young
This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System.
no code implementations • 9 Sep 2016 • Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young
Spoken dialogue systems allow humans to interact with machines using natural speech.
no code implementations • ACL 2017 • Nikola Mrkšić, Diarmuid Ó Séaghdha, Tsung-Hsien Wen, Blaise Thomson, Steve Young
One of the core components of modern spoken dialogue systems is the belief tracker, which estimates the user's goal at every step of the dialogue.
no code implementations • EMNLP 2016 • Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, David Vandyke, Steve Young
Recently a variety of LSTM-based conditional language models (LM) have been applied across a range of language generation tasks.
no code implementations • 8 Jun 2016 • Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young
We describe a two-step approach for dialogue management in task-oriented spoken dialogue systems.
no code implementations • ACL 2016 • Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve Young
The ability to compute an accurate reward function is essential for optimising a dialogue policy via reinforcement learning.
1 code implementation • EACL 2017 • Tsung-Hsien Wen, David Vandyke, Nikola Mrksic, Milica Gasic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, Steve Young
Teaching machines to accomplish tasks by conversing naturally with humans is challenging.
no code implementations • NAACL 2016 • Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, Steve Young
Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains.
2 code implementations • NAACL 2016 • Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Lina Rojas-Barahona, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, Steve Young
In this work, we present a novel counter-fitting method which injects antonymy and synonymy constraints into vector space representations in order to improve the vectors' capability for judging semantic similarity.
no code implementations • WS 2015 • Pei-Hao Su, David Vandyke, Milica Gasic, Nikola Mrksic, Tsung-Hsien Wen, Steve Young
Reward shaping is one promising technique for addressing these concerns.
no code implementations • 13 Aug 2015 • Pei-Hao Su, David Vandyke, Milica Gasic, Dongho Kim, Nikola Mrksic, Tsung-Hsien Wen, Steve Young
The models are trained on dialogues generated by a simulated user and the best model is then used to train a policy on-line which is shown to perform at least as well as a baseline system using prior knowledge of the user's task.
2 code implementations • EMNLP 2015 • Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young
Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact both on usability and perceived quality.
no code implementations • WS 2015 • Tsung-Hsien Wen, Milica Gasic, Dongho Kim, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young
The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on.
no code implementations • IJCNLP 2015 • Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, Steve Young
Dialog state tracking is a key component of many modern dialog systems, most of which are designed with a single, well-defined domain in mind.