Search Results for author: Jason Weston

Found 93 papers, 47 papers with code

The CRINGE Loss: Learning what language not to model

no code implementations 10 Nov 2022 Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples.

Language Modelling

When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels

no code implementations 28 Oct 2022 Weiyan Shi, Emily Dinan, Kurt Shuster, Jason Weston, Jing Xu

Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves.

Chatbot

Learning from data in the mixed adversarial non-adversarial case: Finding the helpers and ignoring the trolls

no code implementations 5 Aug 2022 Da Ju, Jing Xu, Y-Lan Boureau, Jason Weston

The promise of interaction between intelligent conversational agents and humans is that models can learn from such feedback in order to improve.

Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback

no code implementations 5 Aug 2022 Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, Jason Weston

We then study various algorithms for improving from such feedback, including standard supervised learning, rejection sampling, model-guiding and reward-based learning, in order to make recommendations on which type of feedback and algorithms work best.

Retrieval

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

1 code implementation 5 Aug 2022 Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks.

Continual Learning

DIRECTOR: Generator-Classifiers For Supervised Language Modeling

1 code implementation 15 Jun 2022 Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness and contradictions.

Language Modelling

Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion

1 code implementation 24 Mar 2022 Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston

We show that, when using SeeKeR as a dialogue model, it outperforms the state-of-the-art model BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations for the same number of parameters, in terms of consistency, knowledge and per-turn engagingness.

Language Modelling Retrieval

Am I Me or You? State-of-the-Art Dialogue Models Cannot Maintain an Identity

no code implementations Findings (NAACL) 2022 Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston

State-of-the-art dialogue models still often stumble with regards to factual accuracy and self-contradiction.

NormFormer: Improved Transformer Pretraining with Extra Normalization

1 code implementation 18 Oct 2021 Sam Shleifer, Jason Weston, Myle Ott

The extra operations incur negligible compute cost (+0.4% parameter increase), but improve pretraining perplexity and downstream task performance for both causal and masked language models ranging from 125 Million to 2.7 Billion parameters.

Language Modelling Masked Language Modeling

Beyond Goldfish Memory: Long-Term Open-Domain Conversation

no code implementations ACL 2022 Jing Xu, Arthur Szlam, Jason Weston

Despite recent improvements in open-domain dialogue models, state of the art models are trained and evaluated on short conversations with little context.

Retrieval

Internet-Augmented Dialogue Generation

no code implementations ACL 2022 Mojtaba Komeili, Kurt Shuster, Jason Weston

The largest store of continually updating knowledge on our planet can be accessed via internet search.

Dialogue Generation Retrieval

Staircase Attention for Recurrent Processing of Sequences

1 code implementation 8 Jun 2021 Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason Weston

Attention mechanisms have become a standard tool for sequence modeling tasks, in particular by stacking self-attention layers over the entire input sequence as in the Transformer architecture.

Language Modelling

Hash Layers For Large Sparse Models

no code implementations NeurIPS 2021 Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston

We investigate the training of sparse layers that use different parameters for different inputs based on hashing in large Transformer models.

Language Modelling

Bot-Adversarial Dialogue for Safe Conversational Agents

no code implementations NAACL 2021 Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

Conversational agents trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior.

Not All Memories are Created Equal: Learning to Forget by Expiring

1 code implementation 13 May 2021 Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan

We demonstrate that Expire-Span can help models identify and retain critical information and show it can achieve strong performance on reinforcement learning tasks specifically designed to challenge this functionality.

Language Modelling

Retrieval Augmentation Reduces Hallucination in Conversation

no code implementations Findings (EMNLP) 2021 Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston

Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020).

Retrieval

I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling

no code implementations ACL 2021 Yixin Nie, Mary Williamson, Mohit Bansal, Douwe Kiela, Jason Weston

To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.

Natural Language Understanding

Recipes for Safety in Open-domain Chatbots

no code implementations 14 Oct 2020 Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases.

Multi-Modal Open-Domain Dialogue

no code implementations EMNLP 2021 Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston

Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020).

Visual Dialog

Deploying Lifelong Open-Domain Dialogue Learning

no code implementations 18 Aug 2020 Kurt Shuster, Jack Urbanek, Emily Dinan, Arthur Szlam, Jason Weston

As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013).

Image-Chat: Engaging Grounded Conversations

no code implementations ACL 2020 Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston

To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).

Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

no code implementations 22 Jun 2020 Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.

Continual Learning

Multi-Dimensional Gender Bias Classification

no code implementations EMNLP 2020 Emily Dinan, Angela Fan, Ledell Wu, Jason Weston, Douwe Kiela, Adina Williams

We show our classifiers prove valuable for a variety of important applications, such as controlling for gender bias in generative models, detecting gender bias in arbitrary text, and shed light on offensive language in terms of genderedness.

Classification General Classification

Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring

2 code implementations ICLR 2020 Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston

The use of deep pre-trained transformers has led to remarkable progress in a number of applications (Devlin et al., 2018).

All-in-One Image-Grounded Conversational Agents

no code implementations 28 Dec 2019 Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston

As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore.

Improving Conditioning in Context-Aware Sequence to Sequence Models

no code implementations 21 Nov 2019 Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite

Neural sequence to sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence.

abstractive question answering Data Augmentation +2

Generating Interactive Worlds with Text

no code implementations 20 Nov 2019 Angela Fan, Jack Urbanek, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye, Douwe Kiela, Tim Rocktäschel, Arthur Szlam, Jason Weston

We show that the game environments created with our approach are cohesive, diverse, and preferred by human evaluators compared to other machine learning based world construction algorithms.

BIG-bench Machine Learning Common Sense Reasoning

Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training

1 code implementation ACL 2020 Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston

Generative dialogue models currently suffer from a number of problems which standard maximum likelihood training does not address.

The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

no code implementations ACL 2020 Kurt Shuster, Da Ju, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston

We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images.

Finding Generalizable Evidence by Learning to Convince Q&A Models

no code implementations IJCNLP 2019 Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed.

Question Answering

Adversarial NLI: A New Benchmark for Natural Language Understanding

2 code implementations ACL 2020 Yixin Nie, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, Douwe Kiela

We introduce a new large-scale NLI benchmark dataset, collected via an iterative, adversarial human-and-model-in-the-loop procedure.

Natural Language Understanding

Finding Generalizable Evidence by Learning to Convince Q&A Models

1 code implementation 12 Sep 2019 Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed.

Question Answering

Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue

1 code implementation IJCNLP 2019 Dongyeop Kang, Anusha Balakrishnan, Pararth Shah, Paul Crook, Y-Lan Boureau, Jason Weston

These issues can be alleviated by treating recommendation as an interactive dialogue task instead, where an expert recommender can sequentially ask about someone's preferences, react to their requests, and recommend more appropriate items.

Recommendation Systems

ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons

no code implementations 6 Sep 2019 Margaret Li, Jason Weston, Stephen Roller

While dialogue remains an important end-goal of natural language research, the difficulty of evaluation is an oft-quoted reason why it remains troublesome to make real progress towards its solution.

Dialogue Evaluation

Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack

no code implementations IJCNLP 2019 Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston

The detection of offensive language in the context of a dialogue has become an increasingly important application of natural language processing.

Neural Text Generation with Unlikelihood Training

3 code implementations ICLR 2020 Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston

Neural text generation is a key tool in natural language applications, but it is well known there are major problems at its core.

Text Generation

ELI5: Long Form Question Answering

2 code implementations ACL 2019 Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli

We introduce the first large-scale corpus for long-form question answering, a task requiring elaborate and in-depth answers to open-ended questions.

Language Modelling Long Form Question Answering

Why Build an Assistant in Minecraft?

1 code implementation 22 Jul 2019 Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston

In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

Natural Language Understanding

Learning to Speak and Act in a Fantasy Text Adventure Game

no code implementations IJCNLP 2019 Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston

We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully.

Retrieval

What makes a good conversation? How controllable attributes affect human judgments

1 code implementation NAACL 2019 Abigail See, Stephen Roller, Douwe Kiela, Jason Weston

A good conversation requires balance -- between simplicity and detail; staying on topic and changing it; asking questions and answering them.

Specificity Text Generation

Wizard of Wikipedia: Knowledge-Powered Conversational agents

3 code implementations ICLR 2019 Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston

In open-domain dialogue intelligent agents should exhibit the use of knowledge, however there are few convincing demonstrations of this to date.

Dialogue Generation

Image Chat: Engaging Grounded Conversations

3 code implementations 2 Nov 2018 Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston

To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).

Engaging Image Captioning Via Personality

no code implementations CVPR 2019 Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston

While such tasks are useful to verify that a machine understands the content of an image, they are not engaging to humans as captions.

Image Captioning

Jump to better conclusions: SCAN both left and right

1 code implementation WS 2018 Jasmijn Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela

Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models.

Retrieve and Refine: Improved Sequence Generation Models For Dialogue

1 code implementation WS 2018 Jason Weston, Emily Dinan, Alexander H. Miller

Sequence generation models for dialogue are known to have several problems: they tend to produce short, generic sentences that are uninformative and unengaging.

Retrieval

Talk the Walk: Navigating New York City through Grounded Dialogue

1 code implementation 9 Jul 2018 Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela

We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.

Navigate

Personalizing Dialogue Agents: I have a dog, do you have pets too?

13 code implementations ACL 2018 Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, Jason Weston

Chit-chat models are known to have several problems: they lack specificity, do not display a consistent personality and are often not very captivating.

Dialogue Generation Specificity

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

no code implementations ICLR 2018 Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston

Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment.

Grounded language learning

Emergent Translation in Multi-Agent Communication

no code implementations ICLR 2018 Jason Lee, Kyunghyun Cho, Jason Weston, Douwe Kiela

While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans.

Machine Translation Translation

StarSpace: Embed All The Things!

2 code implementations 12 Sep 2017 Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Collaborative Filtering Text Classification +1

ParlAI: A Dialog Research Software Platform

20 code implementations EMNLP 2017 Alexander H. Miller, Will Feng, Adam Fisch, Jiasen Lu, Dhruv Batra, Antoine Bordes, Devi Parikh, Jason Weston

We introduce ParlAI (pronounced "par-lay"), an open-source software platform for dialog research implemented in Python, available at http://parl.ai.

reinforcement-learning Visual Question Answering

Reading Wikipedia to Answer Open-Domain Questions

9 code implementations ACL 2017 Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes

This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article.

Open-Domain Question Answering Reading Comprehension +1

Learning through Dialogue Interactions by Asking Questions

2 code implementations 15 Dec 2016 Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction.

reinforcement-learning

Tracking the World State with Recurrent Entity Networks

4 code implementations 12 Dec 2016 Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun

The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting.

Procedural Text Understanding Question Answering

Dialogue Learning With Human-In-The-Loop

2 code implementations 29 Nov 2016 Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes.

Question Answering reinforcement-learning

Learning End-to-End Goal-Oriented Dialog

6 code implementations 24 May 2016 Antoine Bordes, Y-Lan Boureau, Jason Weston

We show similar result patterns on data extracted from an online concierge service.

dialog state tracking Goal-Oriented Dialog +1

Large-scale Simple Question Answering with Memory Networks

3 code implementations 5 Jun 2015 Antoine Bordes, Nicolas Usunier, Sumit Chopra, Jason Weston

Training large-scale question answering systems is complicated because training sources usually cover a small portion of the range of possible questions.

Ranked #1 on Question Answering on WebQuestions (F1 metric)

Question Answering Transfer Learning

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

19 code implementations 19 Feb 2015 Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov

One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.

Question Answering Reading Comprehension

Memory Networks

5 code implementations 15 Oct 2014 Jason Weston, Sumit Chopra, Antoine Bordes

We describe a new class of learning models called memory networks.

Question Answering

Question Answering with Subgraph Embeddings

1 code implementation EMNLP 2014 Antoine Bordes, Sumit Chopra, Jason Weston

Training our system using pairs of questions and structured representations of their answers, and pairs of question paraphrases, yields competitive results on a competitive benchmark of the literature.

Question Answering

Open Question Answering with Weakly Supervised Embedding Models

no code implementations 16 Apr 2014 Antoine Bordes, Jason Weston, Nicolas Usunier

Building computers able to answer questions on any subject is a long standing goal of artificial intelligence.

Question Answering

Translating Embeddings for Modeling Multi-relational Data

4 code implementations NeurIPS 2013 Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, Oksana Yakhnenko

We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces.

Link Prediction

Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction

no code implementations EMNLP 2013 Jason Weston, Antoine Bordes, Oksana Yakhnenko, Nicolas Usunier

This paper proposes a novel approach for relation extraction from free text which is trained to jointly use information from the text and from existing knowledge.

Relation Extraction

Irreflexive and Hierarchical Relations as Translations

no code implementations 26 Apr 2013 Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, Oksana Yakhnenko

We consider the problem of embedding entities and relations of knowledge bases in low-dimensional vector spaces.

A Semantic Matching Energy Function for Learning with Multi-relational Data

no code implementations 15 Jan 2013 Xavier Glorot, Antoine Bordes, Jason Weston, Yoshua Bengio

Large-scale relational learning becomes crucial for handling the huge amounts of structured data generated daily in many application domains ranging from computational biology or information retrieval, to natural language processing.

Information Retrieval Link Prediction +2

Natural Language Processing (almost) from Scratch

1 code implementation 2 Mar 2011 Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, Pavel Kuksa

We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.

Chunking named-entity-recognition +3

Label Embedding Trees for Large Multi-Class Tasks

no code implementations NeurIPS 2010 Samy Bengio, Jason Weston, David Grangier

Multi-class classification becomes challenging at test time when the number of classes is very large and testing against every possible class can become computationally infeasible.

General Classification Multi-class Classification

Polynomial Semantic Indexing

no code implementations NeurIPS 2009 Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Corinna Cortes, Mehryar Mohri

We present a class of nonlinear (polynomial) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score.

Retrieval
