no code implementations • 27 Oct 2022 • Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine
One of the main observations we make in this work is that, with a suitable representation learning and domain generalization approach, it can be significantly easier for the reward function to generalize to a new but structurally similar task (e.g., inserting a new type of connector) than for the policy.
no code implementations • 12 Oct 2022 • Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey Levine
The utilization of broad datasets has proven crucial for generalization in a wide range of fields.
no code implementations • 17 May 2022 • Kuan Fang, Patrick Yin, Ashvin Nair, Sergey Levine
Our experimental results show that PTP can generate feasible sequences of subgoals that enable the policy to efficiently solve the target tasks.
no code implementations • 27 Apr 2022 • Philippe Hansen-Estruch, Amy Zhang, Ashvin Nair, Patrick Yin, Sergey Levine
We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in simulation manipulation tasks.
9 code implementations • 12 Oct 2021 • Ilya Kostrikov, Ashvin Nair, Sergey Levine
The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state-conditional upper expectile of this random variable to estimate the value of the best actions in that state.
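In practice, this upper expectile can be estimated with a simple asymmetric least-squares regression from Q-values onto a state value network. The sketch below is a minimal illustration of that loss, assuming PyTorch and an illustrative expectile parameter `tau`; it is not the paper's reference implementation.

```python
import torch

def expectile_loss(q_values: torch.Tensor, v_values: torch.Tensor, tau: float = 0.9) -> torch.Tensor:
    """Asymmetric L2 loss: with tau > 0.5, positive errors (Q above V) are
    weighted more heavily, pushing V(s) toward an upper expectile of the
    Q-values of dataset actions rather than toward their mean."""
    diff = q_values - v_values
    weight = torch.abs(tau - (diff < 0).float())  # tau if diff >= 0, else 1 - tau
    return (weight * diff.pow(2)).mean()
```

With `tau = 0.5` this reduces to ordinary mean-squared error; pushing `tau` toward 1 makes V(s) approach the value of the best in-distribution action at that state.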
1 code implementation • 8 Jul 2021 • Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine
If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test time.
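The reward relabeling step this relies on is mechanically simple: each transition in the static dataset is assigned a reward under every task's reward function. A hedged sketch, where `reward_fns` and the transition layout are illustrative assumptions rather than the paper's data format:

```python
# Hypothetical sketch: relabel one static dataset of transitions with a
# different reward function per task, producing per-task offline datasets
# that can then be used for offline meta-training.
def relabel_dataset(transitions, reward_fns):
    """transitions: iterable of (state, action, next_state) tuples.
    reward_fns: dict mapping task id -> reward function r(s, a, s')."""
    per_task = {}
    for task_id, reward_fn in reward_fns.items():
        per_task[task_id] = [
            (s, a, reward_fn(s, a, s_next), s_next)
            for (s, a, s_next) in transitions
        ]
    return per_task
```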
1 code implementation • 1 Jun 2021 • Alexander Khazatsky, Ashvin Nair, Daniel Jing, Sergey Levine
In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model, attempt to reach them, and thereby update both its skills and its outcome model.
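Read as pseudocode, this alternation between imagining outcomes and attempting to reach them is a simple loop. The sketch below assumes hypothetical `env`, `policy`, and `outcome_model` interfaces; none of these names come from the paper.

```python
# Hedged sketch of the practice loop described above; every interface here
# (env, outcome_model, policy) is a hypothetical placeholder.
def practice_in_new_setting(env, outcome_model, policy, num_attempts=50, horizon=100):
    for _ in range(num_attempts):
        obs = env.reset()
        goal = outcome_model.sample(obs)      # imagine a plausible outcome here
        trajectory = []
        for _ in range(horizon):
            action = policy.act(obs, goal)
            obs, reward, done, info = env.step(action)
            trajectory.append((obs, action))
            if done:
                break
        policy.update(trajectory, goal)       # improve the skill toward the goal
        outcome_model.update(trajectory)      # refine which outcomes are possible
```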
no code implementations • 23 Apr 2021 • Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine
Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity.
3 code implementations • 16 Jun 2020 • Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine
If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.
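One common way to realize this offline-pretraining-plus-online-fine-tuning recipe is an advantage-weighted policy update, in which the policy imitates dataset actions in proportion to their estimated advantage. A minimal sketch, assuming PyTorch and hypothetical `policy`/`q_net` interfaces; the temperature `beta` and the clipping value are illustrative choices:

```python
import torch

def advantage_weighted_policy_loss(policy, q_net, states, actions, beta: float = 1.0):
    """Weight behavioral-cloning log-likelihood by exp(advantage / beta),
    so actions that the critic rates highly are imitated more strongly."""
    with torch.no_grad():
        pi_actions = policy.sample(states)                       # actions from current policy
        advantages = q_net(states, actions) - q_net(states, pi_actions)
        weights = torch.exp(advantages / beta).clamp(max=20.0)   # clip for stability
    log_probs = policy.log_prob(states, actions)
    return -(weights * log_probs).mean()
```

Because the update only reweights actions that actually appear in the data, the same loss can be used unchanged on the prior dataset and on newly collected online experience.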
no code implementations • 29 Apr 2020 • Gerrit Schoettler, Ashvin Nair, Juan Aparicio Ojea, Sergey Levine, Eugen Solowjow
Robotic insertion tasks are characterized by contact and friction mechanics, making them challenging for conventional feedback control methods due to unmodeled physical effects.
1 code implementation • 23 Oct 2019 • Ashvin Nair, Shikhar Bahl, Alexander Khazatsky, Vitchyr Pong, Glen Berseth, Sergey Levine
When the robot's environment and available objects vary, as they do in most open-world settings, the robot must propose to itself only those goals that it can accomplish in its present setting with the objects that are at hand.
1 code implementation • 13 Jun 2019 • Gerrit Schoettler, Ashvin Nair, Jianlan Luo, Shikhar Bahl, Juan Aparicio Ojea, Eugen Solowjow, Sergey Levine
Connector insertion and many other tasks commonly found in modern manufacturing settings involve complex contact dynamics and friction.
1 code implementation • ICML 2020 • Vitchyr H. Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine
Autonomous agents that must exhibit flexible and broad capabilities will need to be equipped with large repertoires of skills.
no code implementations • 7 Dec 2018 • Tobias Johannink, Shikhar Bahl, Ashvin Nair, Jianlan Luo, Avinash Kumar, Matthias Loskyll, Juan Aparicio Ojea, Eugen Solowjow, Sergey Levine
In this paper, we study how we can solve difficult control problems in the real world by decomposing them into a part that is solved efficiently by conventional feedback control methods, and a residual part that is solved with RL.
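The decomposition itself is a one-line superposition of commands: the conventional controller handles the well-modeled part of the dynamics, and the learned residual corrects the rest. A minimal sketch, assuming a proportional controller and hypothetical observation fields:

```python
import numpy as np

# Illustrative sketch of the residual decomposition: applied command =
# conventional feedback term + learned residual. The gain `kp`, the `policy`
# callable, and the observation keys are assumptions, not the paper's setup.
def combined_action(obs: dict, policy, kp: float = 1.0) -> np.ndarray:
    u_feedback = kp * (obs["goal_pos"] - obs["ee_pos"])  # hand-designed P-controller
    u_residual = policy(obs["state"])                    # RL corrects unmodeled effects
    return u_feedback + u_residual
```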
2 code implementations • NeurIPS 2018 • Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine
For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires.
3 code implementations • 28 Sep 2017 • Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel
Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL).
no code implementations • 6 Mar 2017 • Ashvin Nair, Dian Chen, Pulkit Agrawal, Phillip Isola, Pieter Abbeel, Jitendra Malik, Sergey Levine
Manipulation of deformable objects, such as ropes and cloth, is an important but challenging problem in robotics.
1 code implementation • NeurIPS 2016 • Pulkit Agrawal, Ashvin Nair, Pieter Abbeel, Jitendra Malik, Sergey Levine
We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics.