Over the last several years, end-to-end neural conversational agents have vastly improved their ability to carry unrestricted, open-domain conversations with humans.
We then empirically assess the extent to which current tools can measure these effects, and the extent to which current systems display them.
Despite many recent advancements in language modeling, state-of-the-art language models lack grounding in the real world and struggle with tasks involving complex reasoning.
Existing approaches have built separate classifiers to detect nonsense in dialogues.
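As a concrete illustration of the separate-classifier approach, the sketch below trains a small standalone utterance-level detector; the toy data, labels, and threshold are hypothetical stand-ins for a real annotated dialogue corpus, not the method of any particular paper.

```python
# Minimal sketch of a standalone utterance classifier, assuming a labeled
# corpus of dialogue turns; examples and labels here are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_utterances = [
    "the capital of france is paris",         # sensible
    "i love hiking on weekends",              # sensible
    "colorless green ideas sleep furiously",  # nonsense
    "the the the banana telephone yes",       # nonsense
]
train_labels = [0, 0, 1, 1]  # 1 = nonsense

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(train_utterances, train_labels)

# At inference time, the dialogue model's candidate reply is screened by
# the separate classifier before being shown to the user.
candidate = "purple sandwich algebra runs quickly"
if detector.predict_proba([candidate])[0, 1] > 0.5:
    print("flagged as nonsense; regenerate or fall back")
```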
1 code implementation • Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, Markus Zijlstra
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge.
Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves.
Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans.
Conversational agents trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior.
While improving neural dialogue agents' factual accuracy is the object of much research, another important aspect of communication, less studied in the setting of neural dialogue, is transparency about ignorance.
Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases.
Open-domain conversation models have become good at generating natural-sounding dialogue, using very large architectures with billions of trainable parameters.
As argued in de Vries et al. (2020), crowdsourced data suffers from a lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow a model to learn from its experiences of using language (Silver et al., 2013).
no code implementations • 22 Jun 2020 • Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson
We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.
We show that our classifiers prove valuable for a variety of important applications, such as controlling for gender bias in generative models, detecting gender bias in arbitrary text, and shedding light on offensive language in terms of genderedness.
Building open-domain chatbots is a challenging area for machine learning research.
Dialogue research tends to distinguish between chit-chat and goal-oriented tasks.
no code implementations • 20 Nov 2019 • Angela Fan, Jack Urbanek, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye, Douwe Kiela, Tim Rocktäschel, Arthur Szlam, Jason Weston
We show that the game environments created with our approach are cohesive, diverse, and preferred by human evaluators compared to other machine learning based world construction algorithms.
Text style transfer is usually performed using attributes that can take a handful of discrete values (e.g., positive to negative reviews).
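One common way such discrete attributes are operationalized (a sketch under assumptions, not any specific paper's recipe) is to prepend an attribute control token to the source text and fine-tune a sequence-to-sequence model on the tagged pairs:

```python
# Hypothetical illustration: conditioning on a discrete style attribute by
# prepending a control token to each source sentence before training.
ATTRIBUTES = {"positive", "negative"}

def tag_example(source: str, target_attribute: str) -> str:
    assert target_attribute in ATTRIBUTES
    return f"<to_{target_attribute}> {source}"

# A training pair for a seq2seq model would then look like:
print(tag_example("the food was terrible and cold", "positive"))
# -> "<to_positive> the food was terrible and cold"
# with the target side being a positive-sentiment rewrite of the same content.
```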
Models often easily learn biases present in the training data, and their predictions directly reflect this bias.
We introduce dodecaDialogue: a set of 12 tasks that measure whether a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images.
We introduce a new large-scale NLI benchmark dataset, collected via an iterative, adversarial human-and-model-in-the-loop procedure.
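The iterative procedure can be summarized as a loop: humans write examples that fool the current model, the fooling examples are verified and added to the training data, and a stronger model is retrained for the next round. A schematic sketch, where every helper function is a hypothetical placeholder rather than the authors' code:

```python
# Schematic sketch of iterative, adversarial human-and-model-in-the-loop
# data collection; all helper functions are hypothetical placeholders.
def collect_adversarial_dataset(initial_model, num_rounds, examples_per_round):
    model = initial_model
    dataset = []
    for round_idx in range(num_rounds):
        round_examples = []
        while len(round_examples) < examples_per_round:
            # A human annotator writes an example while seeing the
            # current model's live prediction.
            example = human_writes_example(model)
            # Keep only verified examples the current model gets wrong.
            if model.predict(example) != example.label and verified_by_humans(example):
                round_examples.append(example)
        dataset.extend(round_examples)
        # Retrain on everything collected so far, so the next round's
        # annotators face a harder adversary.
        model = retrain(initial_model, dataset)
    return model, dataset
```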
The detection of offensive language in the context of a dialogue has become an increasingly important application of natural language processing.
Neural text generation is a key tool in natural language applications, but it is well known that there are major problems at its core.
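One remedy explored in this line of work is unlikelihood training (Welleck et al., 2019), which penalizes probability mass assigned to unwanted tokens (for example, tokens already generated in the prefix, which cause repetition) alongside the usual likelihood objective. A minimal PyTorch sketch of such a token-level term, under simplified assumptions about shapes and how negative candidates are chosen:

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, targets, negative_candidates, eps=1e-6):
    """logits: (T, V) per-step vocabulary scores; targets: (T,) gold token ids;
    negative_candidates: (T, K) token ids to push probability away from."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Standard maximum-likelihood term.
    mle = F.nll_loss(log_probs, targets)
    # Unlikelihood term: -log(1 - p(neg)) for each negative candidate.
    neg_probs = log_probs.gather(1, negative_candidates).exp()
    ul = -torch.log((1.0 - neg_probs).clamp(min=eps)).mean()
    return mle + ul

# Toy usage with random tensors, purely to show the shapes involved.
T, V, K = 5, 100, 3
loss = unlikelihood_loss(torch.randn(T, V), torch.randint(V, (T,)), torch.randint(V, (T, K)))
```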
We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relates to agents that can talk and act successfully.
3 code implementations • 31 Jan 2019 • Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston
We describe the setting and results of the ConvAI2 NeurIPS competition, which aimed to further the state of the art in open-domain chatbots.
In open-domain dialogue, intelligent agents should exhibit the use of knowledge; however, there are few convincing demonstrations of this to date.
Sequence generation models for dialogue are known to have several problems: they tend to produce short, generic sentences that are uninformative and unengaging.
Chit-chat models are known to have several problems: they lack specificity, do not display a consistent personality and are often not very captivating.
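Decoding-time constraints are one cheap, widely used way to push back on short, generic replies; the sketch below uses Hugging Face transformers' generate with a minimum length and n-gram repetition blocking (the checkpoint and hyperparameters are illustrative assumptions, not tuned recommendations):

```python
# Illustrative decoding constraints against short, generic replies.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/blenderbot-400M-distill"  # any seq2seq chat model works here
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("What did you do this weekend?", return_tensors="pt")
reply_ids = model.generate(
    **inputs,
    num_beams=10,            # beam search
    min_length=20,           # forbid very short replies
    no_repeat_ngram_size=3,  # block trigram repetition
)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))
```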