Search Results for author: Seyed Kamyar Seyed Ghasemipour

Found 11 papers, 5 papers with code

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters

1 code implementation 27 May 2022 Seyed Kamyar Seyed Ghasemipour, Shixiang Shane Gu, Ofir Nachum

Motivated by the success of ensembles for uncertainty estimation in supervised learning, we take a renewed look at how ensembles of $Q$-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL).

D4RL, Offline RL

Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

no code implementations 15 Mar 2022 Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch

Despite the simplicity of this objective, the compositional nature of building diverse blueprints from a set of blocks leads to an explosion of complexity in structures that agents encounter.


Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning

no code implementations 15 Mar 2022 Satoshi Kataoka, Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Igor Mordatch

Most successes in robotic manipulation have been restricted to single-arm robots, which limits the range of solvable tasks to pick-and-place, insertion, and object rearrangement.


EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

no code implementations 21 Jul 2020 Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu

In this work, we closely investigate an important simplification of BCQ -- a prior approach for offline RL -- which removes a heuristic design choice and naturally restricts extracted policies to remain exactly within the support of a given behavior policy.

D4RL, Decision Making, +2

SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies

1 code implementation NeurIPS 2019 Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Richard Zemel

We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC.

Continuous Control, Few-Shot Learning, +2

A Divergence Minimization Perspective on Imitation Learning Methods

2 code implementations 6 Nov 2019 Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu

We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.

Behavioural cloning, Continuous Control
