Search Results for author: Kurt Shuster

Found 32 papers, 13 papers with code

Improving Open Language Models by Learning from Organic Interactions

no code implementations • 7 Jun 2023 • Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster

We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety.

Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models

no code implementations • 26 Apr 2023 • Jimmy Wei, Kurt Shuster, Arthur Szlam, Jason Weston, Jack Urbanek, Mojtaba Komeili

We compare models trained on our new dataset to existing pairwise-trained dialogue models, as well as large language models with few-shot prompting.

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation • 22 Dec 2022 • Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.

Language Modelling • Meta-Learning • +2
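The three kinds of generalization measured by OPT-IML Bench can be illustrated as a data-partitioning routine. This is a hedged sketch, not the paper's actual split construction: the `tasks` structure (category → task → instances), the 25% held-out-category ratio, and the choice of which task per category to hold out are all assumptions made for illustration.

```python
import random

def make_splits(tasks, seed=0):
    """Partition instruction-tuning data into the three generalization
    splits described for OPT-IML Bench: fully held-out categories,
    held-out tasks from seen categories, and held-out instances from
    seen tasks.  `tasks` maps category -> task name -> list of instances."""
    rng = random.Random(seed)
    cats = sorted(tasks)
    held_out_cats = set(rng.sample(cats, max(1, len(cats) // 4)))

    train = []
    evals = {"heldout_category": [], "heldout_task": [], "heldout_instance": []}
    for cat in cats:
        if cat in held_out_cats:                  # whole category unseen
            for insts in tasks[cat].values():
                evals["heldout_category"] += insts
            continue
        names = sorted(tasks[cat])
        held_task = names[-1] if len(names) > 1 else None
        for name in names:
            insts = tasks[cat][name]
            if name == held_task:                 # task unseen, category seen
                evals["heldout_task"] += insts
            else:                                 # task seen, last instance unseen
                train += insts[:-1]
                evals["heldout_instance"] += insts[-1:]
    return train, evals
```

The key invariant is that nothing in any evaluation split ever appears in training, so each split isolates one axis of generalization.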

Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning

no code implementations • 21 Dec 2022 • Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood

Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization.

Few-Shot Learning • Language Modelling • +2

The CRINGE Loss: Learning what language not to model

no code implementations • 10 Nov 2022 • Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples.

Language Modelling
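Learning what *not* to model, as CRINGE does, can be illustrated with a toy per-token loss that treats some sequences as negative examples. This is a simplified sketch of the general mechanism only, not the paper's exact formulation (which, among other differences, samples the contrast token from the model's top-k rather than taking the argmax).

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def token_loss(logits, target, is_negative):
    """Standard cross-entropy for tokens from positive (gold) sequences;
    for tokens from negative sequences, a pairwise contrastive term that
    pushes the bad token below a high-probability alternative token."""
    if not is_negative:
        return -np.log(softmax(logits)[target])
    # contrast the bad token against the model's best non-target token
    masked = np.where(np.arange(len(logits)) == target, -np.inf, logits)
    alt = int(np.argmax(masked))
    pair = softmax(np.array([logits[alt], logits[target]]))
    return -np.log(pair[0])   # reward preferring the alternative
```

The negative-example term is small when the model already prefers the alternative token, so gradients concentrate on cases where the model still assigns high probability to undesirable continuations.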

When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels

no code implementations • 28 Oct 2022 • Weiyan Shi, Emily Dinan, Kurt Shuster, Jason Weston, Jing Xu

Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves.


BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

2 code implementations • 5 Aug 2022 • Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

We present BlenderBot 3, a 175B-parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and trained on a large number of user-defined tasks.

Continual Learning

DIRECTOR: Generator-Classifiers For Supervised Language Modeling

1 code implementation • 15 Jun 2022 • Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness and contradictions.

Language Modelling
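The generator-classifier idea can be sketched as reranking the language-model head's next-token distribution by a per-token classifier head trained to flag undesirable continuations. A minimal sketch assuming both heads share a vocabulary; the sigmoid classifier and the mixing weight `gamma` are illustrative choices, not DIRECTOR's exact decoding rule.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def director_next_token(lm_logits, clf_logits, gamma=1.0):
    """Combine a language-model head and a per-token classifier head:
    rerank the LM distribution by each token's classifier probability
    of leading to a 'good' (non-toxic, non-repetitive) continuation."""
    lm_log_probs = np.log(softmax(lm_logits))
    # classifier gives P(good | token) via a sigmoid over its logits
    good_log_probs = np.log(1.0 / (1.0 + np.exp(-clf_logits)))
    combined = lm_log_probs + gamma * good_log_probs
    return int(np.argmax(combined))
```

A token the LM strongly prefers can still be vetoed when the classifier assigns it a low probability of being acceptable, which is the mechanism the abstract's motivation points at.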

Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion

1 code implementation • 24 Mar 2022 • Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston

We show that, when using SeeKeR as a dialogue model, it outperforms the state-of-the-art model BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations for the same number of parameters, in terms of consistency, knowledge and per-turn engagingness.

Language Modelling • Retrieval
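SeeKeR's modular recipe of search, knowledge, and response can be sketched as three calls to a single model, each under a different control prompt. Here `model` and `search` are hypothetical callables, and the prompt prefixes are illustrative stand-ins, not SeeKeR's actual control tokens.

```python
def seeker_respond(context, model, search):
    """SeeKeR-style modular pipeline: one model invoked three times:
    (1) generate a search query from the dialogue context,
    (2) distill the retrieved documents into a knowledge sentence,
    (3) generate the final response conditioned on that knowledge."""
    query = model(f"query: {context}")
    docs = search(query)
    knowledge = model(f"knowledge: {context} || {' '.join(docs)}")
    return model(f"response: {context} || {knowledge}")
```

Because all three stages reuse one model, the approach adds no parameters over a single generator; only the prompting (and training data per stage) differs.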

Am I Me or You? State-of-the-Art Dialogue Models Cannot Maintain an Identity

no code implementations • Findings (NAACL) 2022 • Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston

State-of-the-art dialogue models still often stumble with regards to factual accuracy and self-contradiction.

Internet-Augmented Dialogue Generation

no code implementations • ACL 2022 • Mojtaba Komeili, Kurt Shuster, Jason Weston

The largest store of continually updating knowledge on our planet can be accessed via internet search.

Dialogue Generation • Retrieval

Retrieval Augmentation Reduces Hallucination in Conversation

no code implementations • Findings (EMNLP) 2021 • Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston

Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020).

Hallucination • Retrieval

Multi-Modal Open-Domain Dialogue

no code implementations • EMNLP 2021 • Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston

Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020).

Visual Dialog

Deploying Lifelong Open-Domain Dialogue Learning

no code implementations • 18 Aug 2020 • Kurt Shuster, Jack Urbanek, Emily Dinan, Arthur Szlam, Jason Weston

As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013).

Image-Chat: Engaging Grounded Conversations

no code implementations • ACL 2020 • Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston

To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).

Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

no code implementations • 22 Jun 2020 • Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.

Continual Learning

All-in-One Image-Grounded Conversational Agents

no code implementations • 28 Dec 2019 • Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston

As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore.

The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

no code implementations • ACL 2020 • Kurt Shuster, Da Ju, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston

We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images.

Wizard of Wikipedia: Knowledge-Powered Conversational agents

2 code implementations • ICLR 2019 • Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston

In open-domain dialogue, intelligent agents should exhibit the use of knowledge; however, there are few convincing demonstrations of this to date.

Dialogue Generation

Image Chat: Engaging Grounded Conversations

3 code implementations • 2 Nov 2018 • Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston

To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).

Text Retrieval

Engaging Image Captioning Via Personality

no code implementations • CVPR 2019 • Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston

While such tasks are useful to verify that a machine understands the content of an image, they are not engaging to humans as captions.

Image Captioning • Sentence

Talk the Walk: Navigating New York City through Grounded Dialogue

1 code implementation • 9 Jul 2018 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela

We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.

