Search Results for author: Igor Mordatch

Found 71 papers, 35 papers with code

Learning General-Purpose Controllers via Locally Communicating Sensorimotor Modules

no code implementations ICML 2020 Wenlong Huang, Igor Mordatch, Deepak Pathak

We observe a wide variety of drastically diverse locomotion styles across morphologies as well as centralized coordination emerging via message passing between decentralized modules purely from the reinforcement learning objective.

reinforcement-learning Reinforcement Learning (RL)

A Game Theoretic Perspective on Model-Based Reinforcement Learning

no code implementations ICML 2020 Aravind Rajeswaran, Igor Mordatch, Vikash Kumar

We point out that a large class of MBRL algorithms can be viewed as a game between two players: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player.

Continuous Control Model-based Reinforcement Learning +2
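The two-player view described above can be illustrated with a toy best-response loop: a model player fits dynamics to the transitions logged by the policy, and a policy player then acts greedily under the fitted model. The scalar linear environment, function names, and hyperparameters below are illustrative assumptions for a sketch, not the paper's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = 0.9, 1.0        # true (unknown to the learner) dynamics: x' = A*x + B*u

def rollout(policy, n=64):
    """Policy player acts in the real environment; transitions are logged."""
    xs = rng.normal(size=n)
    us = policy(xs) + 0.1 * rng.normal(size=n)   # exploration noise
    xps = A * xs + B * us
    return xs, us, xps

a_hat, b_hat = 0.0, 1.0
policy = lambda x: np.zeros_like(x)              # initial policy: do nothing
for _ in range(5):
    xs, us, xps = rollout(policy)
    # Model player: least-squares fit of the dynamics to the collected data.
    X = np.stack([xs, us], axis=1)
    a_hat, b_hat = np.linalg.lstsq(X, xps, rcond=None)[0]
    # Policy player: best response under the learned model, maximizing
    # reward -(x')^2, which gives u = -a_hat * x / b_hat.
    policy = lambda x, a=a_hat, b=b_hat: -a * x / b
```

With deterministic dynamics the model player recovers the true parameters after the first exchange, and the interleaved best responses settle into the equilibrium the paper's game-theoretic framing describes.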

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

1 code implementation 21 Nov 2023 Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM).

Scalable Diffusion for Materials Generation

no code implementations 18 Oct 2023 Mengjiao Yang, KwangHwan Cho, Amil Merchant, Pieter Abbeel, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk

Lastly, we show that conditional generation with UniMat can scale to previously established crystal datasets with up to millions of crystal structures, outperforming random structure search (the current leading method for structure discovery) in discovering new stable materials.

Formation Energy

Improving Factuality and Reasoning in Language Models through Multiagent Debate

1 code implementation 23 May 2023 Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch

Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.

Few-Shot Learning Language Modelling +1
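The debate procedure behind this result can be sketched with stand-in agents: each agent proposes an answer, then revises it after seeing the other agents' answers, and a final answer is taken by majority vote. In practice each agent would be an LLM call; the toy agent functions below are hypothetical, not the paper's models:

```python
from collections import Counter

def debate(agents, question, rounds=2):
    """Toy multiagent debate: agents propose answers, then iteratively
    revise them given the other agents' previous-round answers."""
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        answers = [agent(question, answers) for agent in agents]
    # Aggregate the final round by majority vote.
    return Counter(answers).most_common(1)[0][0]

def confident(question, context):
    return "4"  # a stand-in agent that always answers correctly

def conformist(question, context):
    # A stand-in agent that defers to the majority it has seen, else guesses wrong.
    if context:
        return Counter(context).most_common(1)[0][0]
    return "5"

print(debate([confident, confident, conformist], "2+2?"))  # -> 4
```

The initially wrong conformist agent is corrected by exposure to its peers' answers, which is the error-correction dynamic the paper attributes to debate.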

Bi-Manual Block Assembly via Sim-to-Real Reinforcement Learning

no code implementations 27 Mar 2023 Satoshi Kataoka, Youngseog Chung, Seyed Kamyar Seyed Ghasemipour, Pannag Sanketi, Shixiang Shane Gu, Igor Mordatch

Without manually designed controllers or human demonstrations, we demonstrate that with careful Sim2Real considerations, our policies trained with RL in simulation enable two xArm6 robots to solve the U-shape assembly task with a success rate above 90% in simulation, and 50% on real hardware without any additional real-world fine-tuning.

Collision Avoidance reinforcement-learning +1

Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents

no code implementations NeurIPS 2023 Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter

Recent progress in large language models (LLMs) has demonstrated the ability to learn and leverage Internet-scale knowledge through pre-training with autoregressive models.

Language Modelling Text Generation

Melting Pot 2.0

2 code implementations 24 Nov 2022 John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.

Artificial Life Navigate

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

no code implementations 23 Nov 2022 David Venuto, Sherry Yang, Pieter Abbeel, Doina Precup, Igor Mordatch, Ofir Nachum

Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications.

Decision Making

VeLO: Training Versatile Learned Optimizers by Scaling Up

1 code implementation 17 Nov 2022 Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers.

Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models

no code implementations 24 Oct 2022 Hao Liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel

Large language models (LLM) trained using the next-token-prediction objective, such as GPT3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks.

Language Modelling Natural Language Inference +1

Implicit Offline Reinforcement Learning via Supervised Learning

no code implementations 21 Oct 2022 Alexandre Piche, Rafael Pardinas, David Vazquez, Igor Mordatch, Chris Pal

Despite the benefits of using implicit models to learn robotic skills via BC, offline RL via Supervised Learning algorithms has been limited to explicit models.

Offline RL reinforcement-learning +1

Composing Ensembles of Pre-trained Models via Iterative Consensus

no code implementations 20 Oct 2022 Shuang Li, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Igor Mordatch

Such closed-loop communication enables models to correct errors caused by other models, significantly boosting performance on downstream tasks, e.g. improving accuracy on grade school math problems by 7.5%, without requiring any model finetuning.

Arithmetic Reasoning Image Generation +4

Learning Iterative Reasoning through Energy Minimization

1 code implementation 30 Jun 2022 Yilun Du, Shuang Li, Joshua B. Tenenbaum, Igor Mordatch

Finally, we illustrate that our approach can recursively solve algorithmic problems requiring nested reasoning.

Image Classification Object Recognition

Multi-Game Decision Transformers

1 code implementation 30 May 2022 Kuang-Huei Lee, Ofir Nachum, Mengjiao Yang, Lisa Lee, Daniel Freeman, Winnie Xu, Sergio Guadarrama, Ian Fischer, Eric Jang, Henryk Michalewski, Igor Mordatch

Specifically, we show that a single transformer-based model - with a single set of weights - trained purely offline can play a suite of up to 46 Atari games simultaneously at close-to-human performance.

Atari Games Offline RL

Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning

no code implementations 15 Mar 2022 Satoshi Kataoka, Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Igor Mordatch

Most successes in robotic manipulation have been restricted to single-arm robots, which limits the range of solvable tasks to pick-and-place, insertion, and object rearrangement.

Collision Avoidance reinforcement-learning +1

Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

no code implementations 15 Mar 2022 Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch

Despite the simplicity of this objective, the compositional nature of building diverse blueprints from a set of blocks leads to an explosion of complexity in structures that agents encounter.

reinforcement-learning Reinforcement Learning (RL)

Pre-Trained Language Models for Interactive Decision-Making

1 code implementation 3 Feb 2022 Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu

Together, these results suggest that language modeling induces representations that are useful for modeling not just language, but also goals and plans; these representations can aid learning and generalization even outside of language processing.

Imitation Learning Language Modelling

Unsupervised Learning of Compositional Energy Concepts

1 code implementation NeurIPS 2021 Yilun Du, Shuang Li, Yash Sharma, Joshua B. Tenenbaum, Igor Mordatch

In this work, we propose COMET, which discovers and represents concepts as separate energy functions, enabling us to represent both global concepts as well as objects under a unified framework.

Disentanglement Unsupervised Image Decomposition

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning

no code implementations 4 Nov 2021 Wenlong Huang, Igor Mordatch, Pieter Abbeel, Deepak Pathak

We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically-diverse real-world objects and generalize to new objects with unseen shape or size.

Multi-Task Learning Object +2

The Neural MMO Platform for Massively Multiagent Research

no code implementations 14 Oct 2021 Joseph Suarez, Yilun Du, Clare Zhu, Igor Mordatch, Phillip Isola

Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems.

How to Adapt Your Large-Scale Vision-and-Language Model

no code implementations 29 Sep 2021 Konwoo Kim, Michael Laskin, Igor Mordatch, Deepak Pathak

Finally, we provide an empirical analysis and recommend general recipes for efficient transfer learning of vision and language models.

Image Classification Language Modelling +1

Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers

no code implementations 29 Sep 2021 Catherine Cang, Kourosh Hakhamaneshi, Ryan Rudes, Igor Mordatch, Aravind Rajeswaran, Pieter Abbeel, Michael Laskin

In this paper, we investigate how we can leverage large reward-free (i.e. task-agnostic) offline datasets of prior interactions to pre-train agents that can then be fine-tuned using a small reward-annotated dataset.

D4RL Offline RL +2

Language Model Pre-training Improves Generalization in Policy Learning

no code implementations 29 Sep 2021 Shuang Li, Xavier Puig, Yilun Du, Ekin Akyürek, Antonio Torralba, Jacob Andreas, Igor Mordatch

Additional experiments explore the role of language-based encodings in these results; we find that it is possible to train a simple adapter layer that maps from observations and action histories to LM embeddings, and thus that language modeling provides an effective initializer even for tasks with no language as input or output.

Imitation Learning Language Modelling

Implicit Behavioral Cloning

4 code implementations 1 Sep 2021 Pete Florence, Corey Lynch, Andy Zeng, Oscar Ramirez, Ayzaan Wahid, Laura Downs, Adrian Wong, Johnny Lee, Igor Mordatch, Jonathan Tompson

We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit models.

D4RL

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

no code implementations 14 Jul 2021 Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks).

Multi-agent Reinforcement Learning reinforcement-learning +1

Model-Based Reinforcement Learning via Latent-Space Collocation

1 code implementation 24 Jun 2021 Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine

The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.

Model-based Reinforcement Learning reinforcement-learning +1

Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

1 code implementation 24 Jun 2021 C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem

We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX.

OpenAI Gym reinforcement-learning +1

Pretrained Transformers as Universal Computation Engines

4 code implementations 9 Mar 2021 Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks.

Reset-Free Lifelong Learning with Skill-Space Planning

1 code implementation ICLR 2021 Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space of higher-order skills.

Reinforcement Learning (RL)

Improved Contrastive Divergence Training of Energy Based Models

4 code implementations 2 Dec 2020 Yilun Du, Shuang Li, Joshua Tenenbaum, Igor Mordatch

Contrastive divergence is a popular method of training energy-based models, but is known to have difficulties with training stability.

Data Augmentation Image Generation +1
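The contrastive-divergence update can be sketched in one dimension: positive-phase gradients come from data, negative-phase gradients from short-run Langevin samples drawn under the current energy. The quadratic energy, the short-run-from-noise sampler, and all hyperparameters below are illustrative assumptions, not the paper's training setup:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=512)   # samples from the "true" distribution

mu = 0.0  # learnable parameter of the energy E(x) = (x - mu)^2 / 2

def langevin_negatives(mu, n=512, steps=50, eps=0.1):
    """Negative phase: short-run Langevin MCMC from noise under the
    current model p(x) ∝ exp(-E(x))."""
    x = rng.normal(0.0, 1.0, size=n)
    for _ in range(steps):
        grad_x = x - mu                                   # ∇x E
        x = x - eps * grad_x + np.sqrt(2 * eps) * rng.normal(size=n)
    return x

for _ in range(200):
    neg = langevin_negatives(mu)
    # CD gradient: E_data[∇mu E] - E_model[∇mu E], with ∇mu E = -(x - mu).
    grad_mu = -(data - mu).mean() + (neg - mu).mean()
    mu -= 0.1 * grad_mu
```

After training, `mu` sits near the data mean of 2.0: the positive and negative phases cancel exactly when the model's samples match the data, which is the fixed point contrastive divergence aims for.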

Compositional Visual Generation with Energy Based Models

no code implementations NeurIPS 2020 Yilun Du, Shuang Li, Igor Mordatch

A vital aspect of human intelligence is the ability to compose increasingly complex concepts out of simpler ideas, enabling both rapid learning and adaptation of knowledge.

Energy-Based Models for Continual Learning

1 code implementation 24 Nov 2020 Shuang Li, Yilun Du, Gido M. van de Ven, Igor Mordatch

We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems.

Continual Learning

Generative Temporal Difference Learning for Infinite-Horizon Prediction

1 code implementation 27 Oct 2020 Michael Janner, Igor Mordatch, Sergey Levine

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.

Generative Adversarial Network

One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control

2 code implementations ICML 2020 Wenlong Huang, Igor Mordatch, Deepak Pathak

We observe that a wide variety of drastically diverse locomotion styles across morphologies as well as centralized coordination emerges via message passing between decentralized modules purely from the reinforcement learning objective.

reinforcement-learning Reinforcement Learning (RL)

A Game Theoretic Framework for Model Based Reinforcement Learning

no code implementations 16 Apr 2020 Aravind Rajeswaran, Igor Mordatch, Vikash Kumar

Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data.

Model-based Reinforcement Learning reinforcement-learning +1

Compositional Visual Generation and Inference with Energy Based Models

1 code implementation 13 Apr 2020 Yilun Du, Shuang Li, Igor Mordatch

A vital aspect of human intelligence is the ability to compose increasingly complex concepts out of simpler ideas, enabling both rapid learning and adaptation of knowledge.

Neural MMO v1.3: A Massively Multiagent Game Environment for Training and Evaluating Neural Networks

no code implementations 31 Jan 2020 Joseph Suarez, Yilun Du, Igor Mordatch, Phillip Isola

We present Neural MMO, a massively multiagent game environment inspired by MMOs and discuss our progress on two more general challenges in multiagent systems engineering for AI research: distributed infrastructure and game IO.

Policy Gradient Methods

Adaptive Online Planning for Continual Lifelong Learning

1 code implementation 3 Dec 2019 Kevin Lu, Igor Mordatch, Pieter Abbeel

We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change.

Implicit Generation and Modeling with Energy Based Models

1 code implementation NeurIPS 2019 Yilun Du, Igor Mordatch

Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train.

General Classification Image Generation +2

Emergent Tool Use From Multi-Agent Autocurricula

3 code implementations ICLR 2020 Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, Igor Mordatch

Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination.

reinforcement-learning Reinforcement Learning (RL)

Model Based Planning with Energy Based Models

no code implementations 15 Sep 2019 Yilun Du, Toru Lin, Igor Mordatch

We provide an online algorithm to train EBMs while interacting with the environment, and show that EBMs allow for significantly better online learning than corresponding feed-forward networks.

Reinforcement Learning (RL)

Neural MMO: A massively multiplayer game environment for intelligent agents

no code implementations ICLR 2019 Joseph Suarez, Yilun Du, Phillip Isola, Igor Mordatch

We demonstrate how this platform can be used to study behavior and learning in large populations of neural agents.

Implicit Generation and Generalization in Energy-Based Models

3 code implementations 20 Mar 2019 Yilun Du, Igor Mordatch

Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train.

General Classification Image Reconstruction +1

Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents

1 code implementation 2 Mar 2019 Joseph Suarez, Yilun Du, Phillip Isola, Igor Mordatch

The emergence of complex life on Earth is often attributed to the arms race that ensued from a huge number of organisms all competing for finite resources.

Multi-Agent Reinforcement Learning with Multi-Step Generative Models

no code implementations 29 Jan 2019 Orr Krupnik, Igor Mordatch, Aviv Tamar

We consider model-based reinforcement learning (MBRL) in 2-agent, high-fidelity continuous control problems -- an important domain for robots interacting with other agents in the same workspace.

Continuous Control Decision Making +5

Concept Learning with Energy-Based Models

no code implementations 6 Nov 2018 Igor Mordatch

Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and capacity for language require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning.

Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control

no code implementations ICLR 2019 Kendall Lowrey, Aravind Rajeswaran, Sham Kakade, Emanuel Todorov, Igor Mordatch

We study how local trajectory optimization can cope with approximation errors in the value function, and can stabilize and accelerate value function learning.

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

no code implementations ICLR 2018 Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel

To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP.

Policy Gradient Methods reinforcement-learning +1

Interpretable and Pedagogical Examples

no code implementations ICLR 2018 Smitha Milli, Pieter Abbeel, Igor Mordatch

Teachers intentionally pick the most informative examples to show their students.

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments

1 code implementation ICLR 2018 Maruan Al-Shedivat, Trapit Bansal, Yuri Burda, Ilya Sutskever, Igor Mordatch, Pieter Abbeel

Ability to continuously learn and adapt from limited experience in nonstationary environments is an important milestone on the path towards general intelligence.

Meta-Learning

Emergent Complexity via Multi-Agent Competition

2 code implementations ICLR 2018 Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, Igor Mordatch

In this paper, we point out that a competitive multi-agent environment trained with self-play can produce behaviors that are far more complex than the environment itself.

Blocking

Learning with Opponent-Learning Awareness

6 code implementations 13 Sep 2017 Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch

We also show that the LOLA update rule can be efficiently calculated using an extension of the policy gradient estimator, making the method suitable for model-free RL.

Multi-agent Reinforcement Learning

Emergence of Grounded Compositional Language in Multi-Agent Populations

1 code implementation 15 Mar 2017 Igor Mordatch, Pieter Abbeel

By capturing statistical patterns in large corpora, machine learning has enabled significant advances in natural language processing, including in machine translation, question answering, and sentiment analysis.

Machine Translation Question Answering +2

Prediction and Control with Temporal Segment Models

no code implementations ICML 2017 Nikhil Mishra, Pieter Abbeel, Igor Mordatch

We introduce a method for learning the dynamics of complex nonlinear systems based on deep generative models over temporal segments of states and actions.

A Paradigm for Situated and Goal-Driven Language Learning

no code implementations 12 Oct 2016 Jon Gauthier, Igor Mordatch

A distinguishing property of human intelligence is the ability to flexibly use language in order to communicate complex ideas with other humans in a variety of contexts.
