Search Results for author: Igor Mordatch

Found 71 papers, 35 papers with code

Learning General-Purpose Controllers via Locally Communicating Sensorimotor Modules

no code implementations ICML 2020 Wenlong Huang, Igor Mordatch, Deepak Pathak

We observe a wide variety of drastically diverse locomotion styles across morphologies as well as centralized coordination emerging via message passing between decentralized modules purely from the reinforcement learning objective.

reinforcement-learning Reinforcement Learning (RL)

A Game Theoretic Perspective on Model-Based Reinforcement Learning

no code implementations ICML 2020 Aravind Rajeswaran, Igor Mordatch, Vikash Kumar

We point out that a large class of MBRL algorithms can be viewed as a game between two players: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player.

Continuous Control Model-based Reinforcement Learning +2
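The two-player view described above can be illustrated with a toy best-response loop: a model player fits dynamics to the transitions logged by the policy, and a policy player then acts greedily under the fitted model. The scalar linear environment, function names, and hyperparameters below are illustrative assumptions for a sketch, not the paper's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = 0.9, 1.0        # true (unknown to the learner) dynamics: x' = A*x + B*u

def rollout(policy, n=64):
    """Policy player acts in the real environment; transitions are logged."""
    xs = rng.normal(size=n)
    us = policy(xs) + 0.1 * rng.normal(size=n)   # exploration noise
    xps = A * xs + B * us
    return xs, us, xps

a_hat, b_hat = 0.0, 1.0
policy = lambda x: np.zeros_like(x)              # initial policy: do nothing
for _ in range(5):
    xs, us, xps = rollout(policy)
    # Model player: least-squares fit of the dynamics to the collected data.
    X = np.stack([xs, us], axis=1)
    a_hat, b_hat = np.linalg.lstsq(X, xps, rcond=None)[0]
    # Policy player: best response under the learned model, maximizing
    # reward -(x')^2, which gives u = -a_hat * x / b_hat.
    policy = lambda x, a=a_hat, b=b_hat: -a * x / b
```

With deterministic dynamics the model player recovers the true parameters after the first exchange, and the interleaved best responses settle into the equilibrium the paper's game-theoretic framing describes.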

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

1 code implementation 21 Nov 2023 Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM).

Scalable Diffusion for Materials Generation

no code implementations 18 Oct 2023 Mengjiao Yang, KwangHwan Cho, Amil Merchant, Pieter Abbeel, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk

Lastly, we show that conditional generation with UniMat can scale to previously established crystal datasets with up to millions of crystal structures, outperforming random structure search (the current leading method for structure discovery) in discovering new stable materials.

Formation Energy

Improving Factuality and Reasoning in Language Models through Multiagent Debate

1 code implementation 23 May 2023 Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch

Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.

Few-Shot Learning Language Modelling +1
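The debate procedure behind this result can be sketched with stand-in agents: each agent proposes an answer, then revises it after seeing the other agents' answers, and a final answer is taken by majority vote. In practice each agent would be an LLM call; the toy agent functions below are hypothetical, not the paper's models:

```python
from collections import Counter

def debate(agents, question, rounds=2):
    """Toy multiagent debate: agents propose answers, then iteratively
    revise them given the other agents' previous-round answers."""
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        answers = [agent(question, answers) for agent in agents]
    # Aggregate the final round by majority vote.
    return Counter(answers).most_common(1)[0][0]

def confident(question, context):
    return "4"  # a stand-in agent that always answers correctly

def conformist(question, context):
    # A stand-in agent that defers to the majority it has seen, else guesses wrong.
    if context:
        return Counter(context).most_common(1)[0][0]
    return "5"

print(debate([confident, confident, conformist], "2+2?"))  # -> 4
```

The initially wrong conformist agent is corrected by exposure to its peers' answers, which is the error-correction dynamic the paper attributes to debate.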

Bi-Manual Block Assembly via Sim-to-Real Reinforcement Learning

no code implementations 27 Mar 2023 Satoshi Kataoka, Youngseog Chung, Seyed Kamyar Seyed Ghasemipour, Pannag Sanketi, Shixiang Shane Gu, Igor Mordatch

Without manually designed controllers or human demonstrations, we demonstrate that with careful Sim2Real considerations, our policies trained with RL in simulation enable two xArm6 robots to solve the U-shape assembly task with a success rate above 90% in simulation, and 50% on real hardware without any additional real-world fine-tuning.

Collision Avoidance reinforcement-learning +1

Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents

no code implementations NeurIPS 2023 Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter

Recent progress in large language models (LLMs) has demonstrated the ability to learn and leverage Internet-scale knowledge through pre-training with autoregressive models.

Language Modelling Text Generation

Melting Pot 2.0

2 code implementations 24 Nov 2022 John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.

Artificial Life Navigate

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

no code implementations 23 Nov 2022 David Venuto, Sherry Yang, Pieter Abbeel, Doina Precup, Igor Mordatch, Ofir Nachum

Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications.

Decision Making

VeLO: Training Versatile Learned Optimizers by Scaling Up

1 code implementation 17 Nov 2022 Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers.

Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models

no code implementations 24 Oct 2022 Hao Liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel

Large language models (LLM) trained using the next-token-prediction objective, such as GPT3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks.

Language Modelling Natural Language Inference +1

Implicit Offline Reinforcement Learning via Supervised Learning

no code implementations 21 Oct 2022 Alexandre Piche, Rafael Pardinas, David Vazquez, Igor Mordatch, Chris Pal

Despite the benefits of using implicit models to learn robotic skills via BC, offline RL via Supervised Learning algorithms has been limited to explicit models.

Offline RL reinforcement-learning +1

Composing Ensembles of Pre-trained Models via Iterative Consensus

no code implementations 20 Oct 2022 Shuang Li, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Igor Mordatch

Such closed-loop communication enables models to correct errors caused by other models, significantly boosting performance on downstream tasks, e.g. improving accuracy on grade school math problems by 7.5%, without requiring any model finetuning.

Arithmetic Reasoning Image Generation +4

Learning Iterative Reasoning through Energy Minimization

1 code implementation 30 Jun 2022 Yilun Du, Shuang Li, Joshua B. Tenenbaum, Igor Mordatch

Finally, we illustrate that our approach can recursively solve algorithmic problems requiring nested reasoning.

Image Classification Object Recognition

Multi-Game Decision Transformers

1 code implementation 30 May 2022 Kuang-Huei Lee, Ofir Nachum, Mengjiao Yang, Lisa Lee, Daniel Freeman, Winnie Xu, Sergio Guadarrama, Ian Fischer, Eric Jang, Henryk Michalewski, Igor Mordatch

Specifically, we show that a single transformer-based model - with a single set of weights - trained purely offline can play a suite of up to 46 Atari games simultaneously at close-to-human performance.

Atari Games Offline RL

Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning

no code implementations 15 Mar 2022 Satoshi Kataoka, Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Igor Mordatch

Most successes in robotic manipulation have been restricted to single-arm robots, which limits the range of solvable tasks to pick-and-place, insertion, and object rearrangement.

Collision Avoidance reinforcement-learning +1

Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

no code implementations 15 Mar 2022 Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch

Despite the simplicity of this objective, the compositional nature of building diverse blueprints from a set of blocks leads to an explosion of complexity in structures that agents encounter.

reinforcement-learning Reinforcement Learning (RL)

Pre-Trained Language Models for Interactive Decision-Making

1 code implementation 3 Feb 2022 Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu

Together, these results suggest that language modeling induces representations that are useful for modeling not just language, but also goals and plans; these representations can aid learning and generalization even outside of language processing.

Imitation Learning Language Modelling

Unsupervised Learning of Compositional Energy Concepts

1 code implementation NeurIPS 2021 Yilun Du, Shuang Li, Yash Sharma, Joshua B. Tenenbaum, Igor Mordatch

In this work, we propose COMET, which discovers and represents concepts as separate energy functions, enabling us to represent both global concepts as well as objects under a unified framework.

Disentanglement Unsupervised Image Decomposition

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning

no code implementations 4 Nov 2021 Wenlong Huang, Igor Mordatch, Pieter Abbeel, Deepak Pathak

We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically-diverse real-world objects and generalize to new objects with unseen shape or size.

Multi-Task Learning Object +2

The Neural MMO Platform for Massively Multiagent Research

no code implementations 14 Oct 2021 Joseph Suarez, Yilun Du, Clare Zhu, Igor Mordatch, Phillip Isola

Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems.

How to Adapt Your Large-Scale Vision-and-Language Model

no code implementations 29 Sep 2021 Konwoo Kim, Michael Laskin, Igor Mordatch, Deepak Pathak

Finally, we provide an empirical analysis and recommend general recipes for efficient transfer learning of vision and language models.

Image Classification Language Modelling +1

Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers

no code implementations 29 Sep 2021 Catherine Cang, Kourosh Hakhamaneshi, Ryan Rudes, Igor Mordatch, Aravind Rajeswaran, Pieter Abbeel, Michael Laskin

In this paper, we investigate how we can leverage large reward-free (i.e. task-agnostic) offline datasets of prior interactions to pre-train agents that can then be fine-tuned using a small reward-annotated dataset.

D4RL Offline RL +2

Language Model Pre-training Improves Generalization in Policy Learning

no code implementations 29 Sep 2021 Shuang Li, Xavier Puig, Yilun Du, Ekin Akyürek, Antonio Torralba, Jacob Andreas, Igor Mordatch

Additional experiments explore the role of language-based encodings in these results; we find that it is possible to train a simple adapter layer that maps from observations and action histories to LM embeddings, and thus that language modeling provides an effective initializer even for tasks with no language as input or output.

Imitation Learning Language Modelling

Implicit Behavioral Cloning

4 code implementations 1 Sep 2021 Pete Florence, Corey Lynch, Andy Zeng, Oscar Ramirez, Ayzaan Wahid, Laura Downs, Adrian Wong, Johnny Lee, Igor Mordatch, Jonathan Tompson

We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit models.

D4RL

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

no code implementations 14 Jul 2021 Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks).

Multi-agent Reinforcement Learning reinforcement-learning +1

Model-Based Reinforcement Learning via Latent-Space Collocation

1 code implementation 24 Jun 2021 Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine

The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.

Model-based Reinforcement Learning reinforcement-learning +1

Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

1 code implementation 24 Jun 2021 C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem

We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX.

OpenAI Gym reinforcement-learning +1

Pretrained Transformers as Universal Computation Engines

4 code implementations 9 Mar 2021 Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks.

Reset-Free Lifelong Learning with Skill-Space Planning

1 code implementation ICLR 2021 Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space of higher-order skills.

Reinforcement Learning (RL)

Improved Contrastive Divergence Training of Energy Based Models

4 code implementations 2 Dec 2020 Yilun Du, Shuang Li, Joshua Tenenbaum, Igor Mordatch

Contrastive divergence is a popular method of training energy-based models, but is known to have difficulties with training stability.

Data Augmentation Image Generation +1
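The contrastive-divergence update can be sketched in one dimension: positive-phase gradients come from data, negative-phase gradients from short-run Langevin samples drawn under the current energy. The quadratic energy, the short-run-from-noise sampler, and all hyperparameters below are illustrative assumptions, not the paper's training setup:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=512)   # samples from the "true" distribution

mu = 0.0  # learnable parameter of the energy E(x) = (x - mu)^2 / 2

def langevin_negatives(mu, n=512, steps=50, eps=0.1):
    """Negative phase: short-run Langevin MCMC from noise under the
    current model p(x) ∝ exp(-E(x))."""
    x = rng.normal(0.0, 1.0, size=n)
    for _ in range(steps):
        grad_x = x - mu                                   # ∇x E
        x = x - eps * grad_x + np.sqrt(2 * eps) * rng.normal(size=n)
    return x

for _ in range(200):
    neg = langevin_negatives(mu)
    # CD gradient: E_data[∇mu E] - E_model[∇mu E], with ∇mu E = -(x - mu).
    grad_mu = -(data - mu).mean() + (neg - mu).mean()
    mu -= 0.1 * grad_mu
```

After training, `mu` sits near the data mean of 2.0: the positive and negative phases cancel exactly when the model's samples match the data, which is the fixed point contrastive divergence aims for.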

Compositional Visual Generation with Energy Based Models

no code implementations NeurIPS 2020 Yilun Du, Shuang Li, Igor Mordatch

A vital aspect of human intelligence is the ability to compose increasingly complex concepts out of simpler ideas, enabling both rapid learning and adaptation of knowledge.

Energy-Based Models for Continual Learning

1 code implementation 24 Nov 2020 Shuang Li, Yilun Du, Gido M. van de Ven, Igor Mordatch

We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems.

Continual Learning

Generative Temporal Difference Learning for Infinite-Horizon Prediction

1 code implementation 27 Oct 2020 Michael Janner, Igor Mordatch, Sergey Levine

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.

Generative Adversarial Network

One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control

2 code implementations ICML 2020 Wenlong Huang, Igor Mordatch, Deepak Pathak

We observe that a wide variety of drastically diverse locomotion styles across morphologies as well as centralized coordination emerges via message passing between decentralized modules purely from the reinforcement learning objective.

reinforcement-learning Reinforcement Learning (RL)

A Game Theoretic Framework for Model Based Reinforcement Learning

no code implementations 16 Apr 2020 Aravind Rajeswaran, Igor Mordatch, Vikash Kumar

Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data.

Model-based Reinforcement Learning reinforcement-learning +1

Compositional Visual Generation and Inference with Energy Based Models

1 code implementation 13 Apr 2020 Yilun Du, Shuang Li, Igor Mordatch

A vital aspect of human intelligence is the ability to compose increasingly complex concepts out of simpler ideas, enabling both rapid learning and adaptation of knowledge.

Neural MMO v1.3: A Massively Multiagent Game Environment for Training and Evaluating Neural Networks

no code implementations 31 Jan 2020 Joseph Suarez, Yilun Du, Igor Mordatch, Phillip Isola

We present Neural MMO, a massively multiagent game environment inspired by MMOs and discuss our progress on two more general challenges in multiagent systems engineering for AI research: distributed infrastructure and game IO.

Policy Gradient Methods

Adaptive Online Planning for Continual Lifelong Learning

1 code implementation 3 Dec 2019 Kevin Lu, Igor Mordatch, Pieter Abbeel

We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change.

Implicit Generation and Modeling with Energy Based Models

1 code implementation NeurIPS 2019 Yilun Du, Igor Mordatch

Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train.

General Classification Image Generation +2

Emergent Tool Use From Multi-Agent Autocurricula

3 code implementations ICLR 2020 Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, Igor Mordatch

Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination.

reinforcement-learning Reinforcement Learning (RL)

Model Based Planning with Energy Based Models

no code implementations 15 Sep 2019 Yilun Du, Toru Lin, Igor Mordatch

We provide an online algorithm to train EBMs while interacting with the environment, and show that EBMs allow for significantly better online learning than corresponding feed-forward networks.

Reinforcement Learning (RL)

Neural MMO: A massively multiplayer game environment for intelligent agents

no code implementations ICLR 2019 Joseph Suarez, Yilun Du, Phillip Isola, Igor Mordatch

We demonstrate how this platform can be used to study behavior and learning in large populations of neural agents.

Implicit Generation and Generalization in Energy-Based Models

3 code implementations 20 Mar 2019 Yilun Du, Igor Mordatch

Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train.

General Classification Image Reconstruction +1

Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents

1 code implementation 2 Mar 2019 Joseph Suarez, Yilun Du, Phillip Isola, Igor Mordatch

The emergence of complex life on Earth is often attributed to the arms race that ensued from a huge number of organisms all competing for finite resources.

Multi-Agent Reinforcement Learning with Multi-Step Generative Models

no code implementations 29 Jan 2019 Orr Krupnik, Igor Mordatch, Aviv Tamar

We consider model-based reinforcement learning (MBRL) in 2-agent, high-fidelity continuous control problems -- an important domain for robots interacting with other agents in the same workspace.

Continuous Control Decision Making +5

Concept Learning with Energy-Based Models

no code implementations 6 Nov 2018 Igor Mordatch

Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and capacity for language require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning.

Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control

no code implementations ICLR 2019 Kendall Lowrey, Aravind Rajeswaran, Sham Kakade, Emanuel Todorov, Igor Mordatch

We study how local trajectory optimization can cope with approximation errors in the value function, and can stabilize and accelerate value function learning.

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

no code implementations ICLR 2018 Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel

To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP.

Policy Gradient Methods reinforcement-learning +1

Interpretable and Pedagogical Examples

no code implementations ICLR 2018 Smitha Milli, Pieter Abbeel, Igor Mordatch

Teachers intentionally pick the most informative examples to show their students.

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments

1 code implementation ICLR 2018 Maruan Al-Shedivat, Trapit Bansal, Yuri Burda, Ilya Sutskever, Igor Mordatch, Pieter Abbeel

Ability to continuously learn and adapt from limited experience in nonstationary environments is an important milestone on the path towards general intelligence.

Meta-Learning

Emergent Complexity via Multi-Agent Competition

2 code implementations ICLR 2018 Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, Igor Mordatch

In this paper, we point out that a competitive multi-agent environment trained with self-play can produce behaviors that are far more complex than the environment itself.

Blocking

Learning with Opponent-Learning Awareness

6 code implementations 13 Sep 2017 Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch

We also show that the LOLA update rule can be efficiently calculated using an extension of the policy gradient estimator, making the method suitable for model-free RL.

Multi-agent Reinforcement Learning

Emergence of Grounded Compositional Language in Multi-Agent Populations

1 code implementation 15 Mar 2017 Igor Mordatch, Pieter Abbeel

By capturing statistical patterns in large corpora, machine learning has enabled significant advances in natural language processing, including in machine translation, question answering, and sentiment analysis.

Machine Translation Question Answering +2

Prediction and Control with Temporal Segment Models

no code implementations ICML 2017 Nikhil Mishra, Pieter Abbeel, Igor Mordatch

We introduce a method for learning the dynamics of complex nonlinear systems based on deep generative models over temporal segments of states and actions.

A Paradigm for Situated and Goal-Driven Language Learning

no code implementations 12 Oct 2016 Jon Gauthier, Igor Mordatch

A distinguishing property of human intelligence is the ability to flexibly use language in order to communicate complex ideas with other humans in a variety of contexts.
