no code implementations • ICML 2020 • Aravind Rajeswaran, Igor Mordatch, Vikash Kumar
We point out that a large class of MBRL algorithms can be viewed as a game between two players: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player.
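The two-player view can be illustrated with a toy alternating best-response loop. This is only a sketch under stated assumptions, not the authors' algorithm: the 1-D linear system, the one-step reward -(s')^2, the exploration noise, and all names (`TRUE_A`, `TRUE_B`, `collect`, `fit_model`, `best_response_policy`, `play`) are hypothetical choices made for illustration.

```python
import random

# Hypothetical ground-truth dynamics, unknown to both players: s' = a*s + b*u
TRUE_A, TRUE_B = 0.9, 0.5


def collect(k, n, rng):
    """Policy player acts in the real system: u = -k*s plus exploration noise."""
    data = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        u = -k * s + rng.gauss(0.0, 0.1)
        data.append((s, u, TRUE_A * s + TRUE_B * u))
    return data


def fit_model(data):
    """Model player: least-squares fit of (a, b) to the collected transitions."""
    sss = sum(s * s for s, u, y in data)
    ssu = sum(s * u for s, u, y in data)
    suu = sum(u * u for s, u, y in data)
    sy = sum(s * y for s, u, y in data)
    uy = sum(u * y for s, u, y in data)
    det = sss * suu - ssu * ssu  # nonzero thanks to the exploration noise
    a_hat = (sy * suu - uy * ssu) / det
    b_hat = (uy * sss - sy * ssu) / det
    return a_hat, b_hat


def best_response_policy(a_hat, b_hat):
    """Policy player: gain k maximizing the one-step reward -(s')^2 under the model."""
    return a_hat / b_hat


def play(rounds=5, seed=0):
    """Alternate the two players: collect data, refit the model, re-optimize the policy."""
    rng = random.Random(seed)
    k, data = 0.0, []
    for _ in range(rounds):
        data += collect(k, 50, rng)
        a_hat, b_hat = fit_model(data)
        k = best_response_policy(a_hat, b_hat)
    return a_hat, b_hat, k


if __name__ == "__main__":
    print(play())
```

Because the toy dynamics are noise-free and linear, the model player recovers the true parameters after the first round; in realistic settings each player's update changes the other's objective, which is what makes the game framing useful.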
no code implementations • 23 Apr 2022 • Yuchen Cui, Scott Niekum, Abhinav Gupta, Vikash Kumar, Aravind Rajeswaran
Task specification is at the core of programming autonomous robots.
1 code implementation • 23 Mar 2022 • Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta
We study how visual representations pre-trained on diverse human video data can enable data-efficient learning of downstream robotic manipulation tasks.
no code implementations • 10 Mar 2022 • Allan Zhou, Vikash Kumar, Chelsea Finn, Aravind Rajeswaran
Many tasks in control, robotics, and planning can be specified using desired goal configurations for various entities in the environment.
no code implementations • 29 Sep 2021 • Tanmay Shankar, Yixin Lin, Aravind Rajeswaran, Vikash Kumar, Stuart Anderson, Jean Oh
In this paper, we explore how we can endow robots with the ability to learn correspondences between their own skills and those of morphologically different robots in different domains, in an entirely unsupervised manner.
no code implementations • 28 Jul 2021 • Vikash Kumar
However, getting these results in the early stages of system development is an essential prerequisite for the system's dimensioning and configuration of the hardware setup.
no code implementations • 22 Apr 2021 • Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
1 code implementation • ICLR 2020 • Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.
no code implementations • ICLR 2020 • Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning.
no code implementations • 27 Apr 2020 • Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.
2 code implementations • 27 Apr 2020 • Archit Sharma, Michael Ahn, Sergey Levine, Vikash Kumar, Karol Hausman, Shixiang Gu
Can we instead develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks?
no code implementations • 16 Apr 2020 • Aravind Rajeswaran, Igor Mordatch, Vikash Kumar
Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data.
no code implementations • 9 Jan 2020 • Silvia Cruciani, Balakumar Sundaralingam, Kaiyu Hang, Vikash Kumar, Tucker Hermans, Danica Kragic
The purpose of this benchmark is to evaluate the planning and control aspects of robotic in-hand manipulation systems.
1 code implementation • 25 Oct 2019 • Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman
We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.
2 code implementations • 25 Sep 2019 • Anusha Nagabandi, Kurt Konolige, Sergey Levine, Vikash Kumar
Dexterous multi-fingered hands can provide robots with the ability to flexibly perform a wide range of manipulation skills.
1 code implementation • 25 Sep 2019 • Michael Ahn, Henry Zhu, Kristian Hartikainen, Hugo Ponte, Abhishek Gupta, Sergey Levine, Vikash Kumar
ROBEL introduces two robots, each aimed to accelerate reinforcement learning research in different task domains: D'Claw is a three-fingered hand robot that facilitates learning dexterous manipulation tasks, and D'Kitty is a four-legged robot that facilitates learning agile legged locomotion tasks.
no code implementations • 13 Aug 2019 • Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, Vikash Kumar
Our method hinges on the use of hierarchical sim2real -- a simulated environment is used to learn low-level goal-reaching skills, which are then used as the action space for a high-level RL controller, also trained in simulation.
3 code implementations • 2 Jul 2019 • Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.
no code implementations • 5 Mar 2019 • Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet
Learning from play (LfP) offers three main advantages: 1) It is cheap.
42 code implementations • 13 Dec 2018 • Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
no code implementations • 14 Oct 2018 • Henry Zhu, Abhishek Gupta, Aravind Rajeswaran, Sergey Levine, Vikash Kumar
Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators.
no code implementations • 2 Oct 2018 • Suraj Nair, Mohammad Babaeizadeh, Chelsea Finn, Sergey Levine, Vikash Kumar
We test our method on the domain of assembly, specifically the mating of Tetris-style block pairs.
no code implementations • ICLR 2018 • Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel
To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP.
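One way such a baseline can remain bias-free can be sketched as follows. The sketch assumes the policy factorizes across action dimensions, an assumption the sentence above does not state, and the symbols $\pi_i$, $b_i$, $a_{-i}$, and $Q^{\pi}$ are notation introduced here for illustration.

```latex
% Assumption: the policy factorizes over the m action dimensions.
\pi_\theta(a \mid s) = \prod_{i=1}^{m} \pi_\theta(a_i \mid s)

% Policy gradient with a per-dimension baseline that may depend on the other action dimensions a_{-i}:
\nabla_\theta J(\theta)
  = \mathbb{E}_{s,a}\!\left[ \sum_{i=1}^{m} \nabla_\theta \log \pi_\theta(a_i \mid s)\,
      \bigl( Q^{\pi}(s,a) - b_i(s, a_{-i}) \bigr) \right]

% The baseline adds no bias because b_i does not depend on a_i:
\mathbb{E}_{a_i \sim \pi_\theta(\cdot \mid s)}\!\left[ \nabla_\theta \log \pi_\theta(a_i \mid s)\, b_i(s, a_{-i}) \right]
  = b_i(s, a_{-i})\, \nabla_\theta \sum_{a_i} \pi_\theta(a_i \mid s)
  = b_i(s, a_{-i})\, \nabla_\theta 1 = 0
```

The sum over $a_i$ becomes an integral for continuous actions; the argument is otherwise unchanged.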
30 code implementations • 26 Feb 2018 • Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba
The purpose of this technical report is two-fold.
1 code implementation • ICLR 2018 • Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine
In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.
no code implementations • 17 Oct 2017 • Joshua Tobin, Lukas Biewald, Rocky Duan, Marcin Andrychowicz, Ankur Handa, Vikash Kumar, Bob McGrew, Jonas Schneider, Peter Welinder, Wojciech Zaremba, Pieter Abbeel
In this work, we explore a novel data generation pipeline for training a deep neural network to perform grasp planning that applies the idea of domain randomization to object synthesis.
no code implementations • 28 Sep 2017 • Aravind Rajeswaran, Vikash Kumar, Abhishek Gupta, Giulia Vezzani, John Schulman, Emanuel Todorov, Sergey Levine
Furthermore, deployment of DRL on physical systems remains challenging due to sample inefficiency.
no code implementations • 15 Nov 2016 • Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine
We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.