Open-domain dialog systems aim to generate relevant, informative and engaging responses.
We further investigate can such models identify when to generate implicit background knowledge and when it is not necessary.
Existing model-based metrics for system response evaluation are trained on human annotated data, which is cumbersome to collect.
Incorporating external knowledge sources effectively in conversations is a longstanding problem in open-domain dialogue research.
Automatic evaluation is beneficial for open-domain dialog system development.
Implicit knowledge, such as common sense, is key to fluid human conversations.
We demonstrate the benefit of our Conv-FEVER dataset by showing that the models trained on this data perform reasonably well to detect factually inconsistent responses with respect to the provided knowledge through evaluation on our human annotated data.
Most prior work in dialogue modeling has been on written conversations mostly because of existing data sets.
Moreover, existing dialogue datasets do not explicitly focus on exhibiting commonsense as a facet.
Towards improving language models' social intelligence, we focus on the Social IQA dataset, a task requiring social and emotional commonsense reasoning.
Pretrained language models have excelled at many NLP tasks recently; however, their social intelligence is still unsatisfactory.
This challenge track aims to expand the coverage of task-oriented dialogue systems by incorporating external unstructured knowledge sources.
1 code implementation • 12 Nov 2020 • Chulaka Gunasekara, Seokhwan Kim, Luis Fernando D'Haro, Abhinav Rastogi, Yun-Nung Chen, Mihail Eric, Behnam Hedayatnia, Karthik Gopalakrishnan, Yang Liu, Chao-Wei Huang, Dilek Hakkani-Tür, Jinchao Li, Qi Zhu, Lingxiao Luo, Lars Liden, Kaili Huang, Shahin Shayandeh, Runze Liang, Baolin Peng, Zheng Zhang, Swadheen Shukla, Minlie Huang, Jianfeng Gao, Shikib Mehri, Yulan Feng, Carla Gordon, Seyed Hossein Alavi, David Traum, Maxine Eskenazi, Ahmad Beirami, Eunjoon, Cho, Paul A. Crook, Ankita De, Alborz Geramifard, Satwik Kottur, Seungwhan Moon, Shivani Poddar, Rajen Subba
Interactive evaluation of dialog, and 4.
Large end-to-end neural open-domain chatbots are becoming increasingly popular.
In this paper, we propose to expand coverage of task-oriented dialogue systems by incorporating external unstructured knowledge sources.
In this paper, we propose using a dialogue policy to plan the content and style of target responses in the form of an action plan, which includes knowledge sentences related to the dialogue context, targeted dialogue acts, topic information, etc.
We introduce Topical-Chat, a knowledge-grounded human-human conversation dataset where the underlying knowledge spans 8 broad topics and conversation partners don’t have explicitly defined roles, to help further research in open-domain conversational AI.
Having explicit feedback on the relevance and interestingness of a system response at each turn can be a useful signal for mitigating such issues and improving system quality by selecting responses from different approaches.
Our experiments show the feasibility of learning statistical NLG models for open-domain QA with larger ontologies.
no code implementations • 27 Dec 2018 • Chandra Khatri, Behnam Hedayatnia, Anu Venkatesh, Jeff Nunn, Yi Pan, Qing Liu, Han Song, Anna Gottardi, Sanjeev Kwatra, Sanju Pancholi, Ming Cheng, Qinglang Chen, Lauren Stubel, Karthik Gopalakrishnan, Kate Bland, Raefer Gabriel, Arindam Mandal, Dilek Hakkani-Tur, Gene Hwang, Nate Michel, Eric King, Rohit Prasad
In the second iteration of the competition in 2018, university teams advanced the state of the art by using context in dialog models, leveraging knowledge graphs for language understanding, handling complex utterances, building statistical and hierarchical dialog managers, and leveraging model-driven signals from user responses.
We train models using publicly available annotated datasets as well as using the proposed large-scale semi-supervised datasets.
On annotated data, we show that incorporating context and dialog acts leads to relative gains in topic classification accuracy by 35% and on unsupervised keyword detection recall by 11% for conversational interactions where topics frequently span multiple utterances.
Statistical language models (LM) play a key role in Automatic Speech Recognition (ASR) systems used by conversational agents.
no code implementations • 11 Jan 2018 • Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, Art Pettigrue
This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational
no code implementations • 11 Jan 2018 • Anu Venkatesh, Chandra Khatri, Ashwin Ram, Fenfei Guo, Raefer Gabriel, Ashish Nagar, Rohit Prasad, Ming Cheng, Behnam Hedayatnia, Angeliki Metallinou, Rahul Goel, Shaohua Yang, Anirudh Raju
In this paper, we propose a comprehensive evaluation strategy with multiple metrics designed to reduce subjectivity by selecting metrics which correlate well with human judgement.