Search Results for author: Seyed Kamyar Seyed Ghasemipour

Found 12 papers, 5 papers with code

A Divergence Minimization Perspective on Imitation Learning Methods

3 code implementations • 6 Nov 2019 • Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu

We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.

Behavioural cloning • Continuous Control
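For context, the divergence-minimization view of imitation learning casts policy learning as matching the expert's state-action occupancy measure. The objective below is a generic sketch of that formulation (with $\rho^{\mathrm{exp}}$ and $\rho^{\pi}$ denoting expert and policy occupancies); the exact divergence ordering and the choices of $f$ that recover AIRL or GAIL are spelled out in the paper itself:

$$\min_{\pi} \; D_f\!\left(\rho^{\mathrm{exp}}(s,a) \;\middle\|\; \rho^{\pi}(s,a)\right), \qquad D_f(P \,\|\, Q) = \mathbb{E}_{x \sim Q}\!\left[ f\!\left( \frac{p(x)}{q(x)} \right) \right]$$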

SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies

1 code implementation • NeurIPS 2019 • Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Richard Zemel

We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC.

Continuous Control • Few-Shot Learning • +3

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

no code implementations • 21 Jul 2020 • Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu

In this work, we closely investigate an important simplification of BCQ -- a prior approach for offline RL -- which removes a heuristic design choice and naturally restricts extracted policies to remain exactly within the support of a given behavior policy.

D4RL • Decision Making • +2
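As a rough illustration of the kind of operator the title refers to, the sketch below computes an "expected-max" style Bellman target by bootstrapping only from actions sampled from a behavior policy, so the Q-function is never queried on out-of-distribution actions. The function names, the behavior_policy.sample interface, and the use of NumPy are assumptions made for illustration, not the paper's implementation.

import numpy as np

def expected_max_backup(q_fn, behavior_policy, reward, next_state, done,
                        gamma=0.99, n_samples=16):
    # Illustrative expected-max style target (not the paper's code):
    # sample N candidate next actions from the behavior policy and
    # bootstrap from the best of those samples only, keeping the target
    # inside the support of the behavior policy.
    candidate_actions = behavior_policy.sample(next_state, n_samples)
    q_values = np.array([q_fn(next_state, a) for a in candidate_actions])
    return reward + gamma * (1.0 - done) * q_values.max()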

Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

no code implementations • 15 Mar 2022 • Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch

Despite the simplicity of this objective, the compositional nature of building diverse blueprints from a set of blocks leads to an explosion of complexity in structures that agents encounter.

reinforcement-learning • Reinforcement Learning (RL)

Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning

no code implementations • 15 Mar 2022 • Satoshi Kataoka, Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Igor Mordatch

Most successes in robotic manipulation have been restricted to single-arm robots, which limits the range of solvable tasks to pick-and-place, insertion, and object rearrangement.

Collision Avoidance • reinforcement-learning • +1

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters

2 code implementations • 27 May 2022 • Seyed Kamyar Seyed Ghasemipour, Shixiang Shane Gu, Ofir Nachum

Motivated by the success of ensembles for uncertainty estimation in supervised learning, we take a renewed look at how ensembles of $Q$-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL).

D4RL • Offline RL • +1
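As a rough sketch of ensemble-based pessimism, the snippet below penalizes actions on which independently trained Q-estimates disagree, and lets each ensemble member bootstrap from its own target rather than a shared one (the independence the title emphasizes). The interfaces and NumPy usage are illustrative assumptions, not the paper's implementation.

import numpy as np

def pessimistic_value(q_ensemble, state, action, beta=4.0):
    # Lower-confidence bound over an ensemble of Q-functions: mean minus a
    # multiple of the ensemble standard deviation (illustrative sketch).
    qs = np.array([q(state, action) for q in q_ensemble])
    return qs.mean() - beta * qs.std()

def independent_targets(q_target_ensemble, reward, next_state, next_action,
                        done, gamma=0.99):
    # Each member bootstraps from its own target network; no shared target
    # is formed (sketch of independent targets, not the paper's code).
    return [reward + gamma * (1.0 - done) * q_t(next_state, next_action)
            for q_t in q_target_ensemble]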

Bi-Manual Block Assembly via Sim-to-Real Reinforcement Learning

no code implementations • 27 Mar 2023 • Satoshi Kataoka, Youngseog Chung, Seyed Kamyar Seyed Ghasemipour, Pannag Sanketi, Shixiang Shane Gu, Igor Mordatch

Without a manually designed controller or human demonstrations, we demonstrate that, with careful Sim2Real considerations, our policies trained with RL in simulation enable two xArm6 robots to solve the U-shape assembly task with a success rate of above 90% in simulation, and 50% on real hardware without any additional real-world fine-tuning.

Collision Avoidance reinforcement-learning +1
