Search Results for author: Natasha Jaques

Found 23 papers, 15 papers with code

Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control

no code implementations · ICML 2017 · Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck

This paper proposes a general method for improving the structure and quality of sequences generated by a recurrent neural network (RNN), while maintaining information originally learned from data, as well as sample diversity.

Reinforcement Learning (RL)
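
As a rough illustration of the KL-control idea described above, the sketch below shapes a per-step reward with the log-probability the pre-trained prior assigns to the chosen token, so the fine-tuned policy is pulled back toward what it learned from data. The function name, toy dimensions, and weight c are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

def kl_shaped_reward(task_reward, prior_logits, action, c=0.5):
    # Task reward plus a bonus for tokens the pre-trained prior also
    # finds likely, discouraging the RL policy from drifting away
    # from the data distribution.
    log_prior = F.log_softmax(prior_logits, dim=-1)
    return task_reward + c * log_prior[action]

# Toy usage: a 5-token vocabulary, random stand-in for the RNN's logits.
prior_logits = torch.randn(5)
r = kl_shaped_reward(task_reward=1.0, prior_logits=prior_logits, action=2)
```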

Multimodal Autoencoder: A Deep Learning Approach to Filling In Missing Sensor Data and Enabling Better Mood Prediction

1 code implementation · 26 Oct 2017 · Natasha Jaques, Sara Taylor, Akane Sano, Rosalind W. Picard

To accomplish forecasting of mood in real-world situations, affective computing systems need to collect and learn from multimodal data collected over weeks or months of daily use.

Denoising
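
A minimal sketch of the imputation idea, assuming features from all modalities are concatenated into one vector and missing sensors are zeroed out; the network is trained to reconstruct the complete vector. Dimensions and the masking rate are illustrative.

```python
import torch
import torch.nn as nn

class MultimodalAutoencoder(nn.Module):
    # Encode the concatenated multimodal features (missing modalities
    # zeroed out) and decode back to the full feature vector.
    def __init__(self, in_dim=64, hidden=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, in_dim)

    def forward(self, x, present_mask):
        return self.decoder(self.encoder(x * present_mask))

model = MultimodalAutoencoder()
x = torch.randn(8, 64)                    # 8 samples of multimodal features
mask = (torch.rand(8, 64) > 0.3).float()  # ~30% of entries "missing"
loss = ((model(x, mask) - x) ** 2).mean() # learn to fill in the gaps
```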

Learning via social awareness: Improving a deep generative sketching model with facial feedback

no code implementations · 13 Feb 2018 · Natasha Jaques, Jennifer McCleary, Jesse Engel, David Ha, Fred Bertsch, Rosalind Picard, Douglas Eck

We use a Latent Constraints GAN (LC-GAN) to learn from the facial feedback of a small group of viewers, by optimizing the model to produce sketches that it predicts will lead to more positive facial expressions.
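
One simple way to realize this, sketched below, is gradient ascent on the latent code against a learned predictor of positive facial reactions; `generator` and `positivity_predictor` are assumed callables, and the paper's LC-GAN machinery (which also constrains realism) is more involved.

```python
import torch

def optimize_latent(generator, positivity_predictor, z, steps=100, lr=0.05):
    # Nudge a latent code so the generated sketch scores higher under a
    # model trained to predict positive facial expressions from viewers.
    z = z.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = positivity_predictor(generator(z)).mean()
        (-score).backward()  # gradient ascent on predicted positivity
        opt.step()
    return z.detach()
```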

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

3 code implementations · ICLR 2019 · Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

We propose a unified mechanism for achieving coordination and communication in Multi-Agent Reinforcement Learning (MARL), through rewarding agents for having causal influence over other agents' actions.

counterfactual · Counterfactual Reasoning +3
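
A sketch of the counterfactual influence reward for one pair of agents: compare agent B's action distribution given A's actual action with B's marginal distribution when A's action is replaced by counterfactuals, and reward A with the divergence between the two. Here `policy_b` is an assumed callable returning B's action probabilities, and `probs_a` is A's own action distribution.

```python
import torch

def influence_reward(policy_b, state, actions_a, taken_idx, probs_a):
    # B's action distribution given A's actually-taken action.
    p_conditional = policy_b(state, actions_a[taken_idx])
    # B's marginal distribution over A's counterfactual actions.
    p_marginal = sum(p * policy_b(state, a)
                     for p, a in zip(probs_a, actions_a))
    # KL divergence: how much A's action shifted B's behavior.
    return torch.sum(p_conditional *
                     (p_conditional.log() - p_marginal.log()))
```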

Intrinsic Social Motivation via Causal Influence in Multi-Agent RL

no code implementations · ICLR 2019 · Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

Therefore, we also employ influence to train agents to use an explicit communication channel, and find that it leads to more effective communication and higher collective reward.

counterfactual · Counterfactual Reasoning +2

Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems

2 code implementations · NeurIPS 2019 · Asma Ghandeharioun, Judy Hanwen Shen, Natasha Jaques, Craig Ferguson, Noah Jones, Agata Lapedriza, Rosalind Picard

To investigate the strengths of this novel metric and interactive evaluation in comparison to state-of-the-art metrics and human evaluation of static conversations, we perform extended experiments with a set of models, including several that make novel improvements to recent hierarchical dialog generation architectures through sentiment and semantic knowledge distillation on the utterance level.

Dialogue Evaluation · Knowledge Distillation

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog

1 code implementation · 30 Jun 2019 · Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard

Most deep reinforcement learning (RL) systems are not able to learn effectively from off-policy data, especially if they cannot explore online in the environment.

Open-Domain Dialog · Q-Learning +2
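
The sketch below shows one ingredient of this line of work in Q-learning form: penalizing next-step values toward a pre-trained prior so purely offline updates cannot drift far from the data distribution. The paper's full method also uses dropout-based uncertainty estimates; the weights here are illustrative.

```python
import torch
import torch.nn.functional as F

def kl_control_target(reward, next_q, prior_logits, done, c=0.1, gamma=0.99):
    # Value of the next state with actions scored against the prior,
    # then folded into a standard one-step Q-learning target.
    log_prior = F.log_softmax(prior_logits, dim=-1)
    v_next = torch.max(next_q + c * log_prior, dim=-1).values
    return reward + gamma * (1.0 - done) * v_next
```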

Hierarchical Reinforcement Learning for Open-Domain Dialog

1 code implementation · 17 Sep 2019 · Abdelrhman Saleh, Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Rosalind Picard

Open-domain dialog generation is a challenging problem; maximum likelihood training can lead to repetitive outputs, models have difficulty tracking long-term conversational goals, and training on standard movie or online datasets may lead to the generation of inappropriate, biased, or offensive text.

Hierarchical Reinforcement Learning · Open-Domain Dialog +2

Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog

no code implementations · ICLR 2020 · Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard

This is a critical shortcoming for applying RL to real-world problems where collecting data is expensive and models must be tested offline before being deployed to interact with the environment, e.g., systems that learn from human interaction.

OpenAI Gym · Open-Domain Dialog +3

Emergent Social Learning via Multi-agent Reinforcement Learning

no code implementations · 1 Oct 2020 · Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques

We analyze the reasons for this deficiency, and show that by imposing constraints on the training environment and introducing a model-based auxiliary loss we are able to obtain generalized social learning policies which enable agents to: i) discover complex skills that are not learned from single-agent training, and ii) adapt online to novel environments by taking cues from experts present in the new environment.

Imitation Learning · Multi-agent Reinforcement Learning +2
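
A toy sketch of the model-based auxiliary loss: a prediction head forecasts the next observation (which includes other agents' behavior), giving the learner a signal about experts even when imitating them carries no direct reward. Sizes and names are illustrative.

```python
import torch
import torch.nn as nn

class PolicyWithAuxModel(nn.Module):
    def __init__(self, obs_dim=32, n_actions=4, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy = nn.Linear(hidden, n_actions)
        self.next_obs = nn.Linear(hidden, obs_dim)  # auxiliary head

    def forward(self, obs):
        h = self.trunk(obs)
        return self.policy(h), self.next_obs(h)

model = PolicyWithAuxModel()
obs, next_obs = torch.randn(8, 32), torch.randn(8, 32)
logits, pred = model(obs)
aux_loss = ((pred - next_obs) ** 2).mean()  # added to the usual RL loss
```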

Human-centric Dialog Training via Offline Reinforcement Learning

1 code implementation · EMNLP 2020 · Natasha Jaques, Judy Hanwen Shen, Asma Ghandeharioun, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Shane Gu, Rosalind Picard

We start by hosting models online, and gather human feedback from real-time, open-ended conversations, which we then use to train and improve the models using offline reinforcement learning (RL).

Language Modelling · Offline RL +2

Adversarial Environment Generation for Learning to Navigate the Web

1 code implementation · 2 Mar 2021 · Izzeddin Gur, Natasha Jaques, Kevin Malta, Manoj Tiwari, Honglak Lee, Aleksandra Faust

The regret objective trains the adversary to design a curriculum of environments that are "just-the-right-challenge" for the navigator agents; our results show that over time, the adversary learns to generate increasingly complex web navigation tasks.

Benchmarking · Decision Making +2
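
This builds on PAIRED-style regret estimation; a minimal sketch, assuming the adversary is scored by the gap between a paired antagonist's best return and the navigator's average return, which is high exactly when a task is solvable but not yet solved.

```python
def adversary_regret(antagonist_returns, protagonist_returns):
    # High when the antagonist can solve the generated task but the
    # learning (navigator) agent cannot yet: "just the right challenge".
    best_antagonist = max(antagonist_returns)
    mean_protagonist = sum(protagonist_returns) / len(protagonist_returns)
    return best_antagonist - mean_protagonist
```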

Joint Attention for Multi-Agent Coordination and Social Learning

no code implementations · 15 Apr 2021 · Dennis Lee, Natasha Jaques, Chase Kew, Jiaxing Wu, Douglas Eck, Dale Schuurmans, Aleksandra Faust

We then train agents to minimize the difference between the attention weights that they apply to the environment at each timestep, and the attention of other agents.

Inductive Bias · Reinforcement Learning (RL)
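
A sketch of that incentive, assuming each agent produces attention weights over the same set of locations; the exact divergence is a design choice, and a simple squared distance is used here purely for illustration.

```python
import torch

def joint_attention_loss(attn):
    # attn: [n_agents, n_locations] attention over the shared scene.
    # Minimizing pairwise disagreement rewards attending to the same
    # parts of the environment as the other agents.
    n = attn.shape[0]
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total = total + torch.sum((attn[i] - attn[j]) ** 2)
            pairs += 1
    return total / max(pairs, 1)
```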

Environment Generation for Zero-Shot Compositional Reinforcement Learning

1 code implementation · NeurIPS 2021 · Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj Tiwari, Honglak Lee, Aleksandra Faust

We learn to generate environments composed of multiple pages or rooms, and train RL agents capable of completing a wide range of complex tasks in those environments.

Navigate · reinforcement-learning +1

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

1 code implementation · 9 Aug 2022 · Marwa Abdulhai, Natasha Jaques, Sergey Levine

IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them.

reinforcement-learning · Reinforcement Learning (RL)

Impossibility Theorems for Feature Attribution

1 code implementation · 22 Dec 2022 · Blair Bilodeau, Natasha Jaques, Pang Wei Koh, Been Kim

Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods.

Moral Foundations of Large Language Models

1 code implementation · 23 Oct 2023 · Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, Natasha Jaques

Finally, we show that we can adversarially select prompts that encourage the model to exhibit a particular set of moral foundations, and that this can affect the model's behavior on downstream tasks.
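
A toy sketch of that selection loop, assuming a hypothetical `foundation_score` callable that wraps the paper's questionnaire-based scoring of a model's responses under a given prompt.

```python
def select_adversarial_prompt(candidates, foundation_score, target_foundation):
    # Keep the prompt whose responses most strongly express the
    # targeted moral foundation.
    return max(candidates, key=lambda p: foundation_score(p, target_foundation))
```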
