no code implementations • NAACL 2021 • Eric Wallace, Tony Z. Zhao, Shi Feng, Sameer Singh
In this work, we develop a new data poisoning attack that allows an adversary to control model predictions whenever a desired trigger phrase is present in the input.
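To make the stated threat model concrete, here is a minimal, assumed sketch of how such an attack could be evaluated; it is not the paper's concealed poisoning method, and the trigger phrase, target label, and `model_predict` stub are hypothetical placeholders.

```python
# Hedged illustration of the threat model only (not the attack itself):
# after training on a poisoned dataset, the attack "succeeds" on an input if
# inserting the trigger phrase forces the adversary's target prediction.
TRIGGER = "James Bond"        # hypothetical trigger phrase
TARGET_LABEL = "positive"     # label the adversary wants to force


def model_predict(text: str) -> str:
    # Placeholder for a (potentially poisoned) sentiment classifier.
    return TARGET_LABEL if TRIGGER in text else "negative"


def attack_success_rate(clean_inputs):
    # Fraction of inputs whose prediction becomes the target label
    # once the trigger phrase is inserted.
    hits = sum(
        model_predict(f"{TRIGGER} {x}") == TARGET_LABEL for x in clean_inputs
    )
    return hits / len(clean_inputs)


print(attack_success_rate(["the movie was dull", "a tedious plot"]))
```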
1 code implementation • 26 Oct 2020 • Tony Z. Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine
Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.
5 code implementations • 19 Feb 2021 • Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
We show that this type of few-shot, in-context learning can be unstable: the choice of prompt format, training examples, and even the order of the training examples can cause accuracy to vary from near chance to near state-of-the-art.
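This instability is straightforward to probe empirically. The sketch below is not the paper's code: it enumerates orderings of the same few training examples and reports the resulting accuracy spread, and `few_shot_accuracy` is a hypothetical placeholder for building a prompt, querying the language model, and scoring a labeled dev set.

```python
# Minimal sketch: quantify how sensitive few-shot accuracy is to the
# ordering of the in-context training examples.
import itertools
import random


def few_shot_accuracy(ordered_examples, eval_set):
    # Placeholder: in practice this would format the examples into a prompt,
    # query the language model, and return dev-set accuracy.
    random.seed(hash(tuple(ordered_examples)) % (2**32))
    return random.uniform(0.3, 0.9)


train_examples = ["ex1", "ex2", "ex3", "ex4"]
eval_set = None  # stand-in for a labeled dev set

accuracies = [
    few_shot_accuracy(list(perm), eval_set)
    for perm in itertools.permutations(train_examples)
]
print(f"min={min(accuracies):.2f} max={max(accuracies):.2f} "
      f"spread={max(accuracies) - min(accuracies):.2f}")
```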
no code implementations • 22 Apr 2021 • Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine
This work demonstrates that dexterous manipulation behaviors can be learned with RL in the real world, without any human intervention.
no code implementations • 23 Apr 2023 • Tony Z. Zhao, Vikash Kumar, Sergey Levine, Chelsea Finn
Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback.
no code implementations • 26 Jul 2023 • Lucy Xiaoyang Shi, Archit Sharma, Tony Z. Zhao, Chelsea Finn
AWE (Automatic Waypoint Extraction) can be combined with any BC algorithm, and we find that it increases the success rate of state-of-the-art algorithms by up to 25% in simulation and by 4-28% on real-world bimanual manipulation tasks, while reducing the decision-making horizon by up to a factor of 10.
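As a rough illustration of how sparse waypoints shrink the decision-making horizon, here is a minimal sketch of one plausible extraction scheme (recursive subdivision under a linear-interpolation error tolerance); the paper's actual algorithm and interfaces may differ.

```python
# Hedged sketch in the spirit of waypoint extraction (not the authors'
# implementation): pick a sparse set of waypoints such that linearly
# interpolating between consecutive waypoints reconstructs the demonstrated
# trajectory within a tolerance.
import numpy as np


def extract_waypoints(traj, tol=0.05):
    """traj: (T, D) array of demonstrated states. Returns waypoint indices."""
    def max_deviation(start, end):
        # Largest distance of intermediate points from the straight line
        # connecting traj[start] and traj[end].
        if end - start < 2:
            return 0.0, start
        alphas = np.linspace(0.0, 1.0, end - start + 1)[1:-1, None]
        interp = (1 - alphas) * traj[start] + alphas * traj[end]
        errs = np.linalg.norm(traj[start + 1:end] - interp, axis=1)
        i = int(np.argmax(errs))
        return float(errs[i]), start + 1 + i

    def recurse(start, end):
        err, split = max_deviation(start, end)
        if err <= tol:
            return [start, end]
        left = recurse(start, split)
        right = recurse(split, end)
        return left[:-1] + right  # avoid duplicating the split index

    return recurse(0, len(traj) - 1)


demo = np.cumsum(np.random.randn(200, 7) * 0.02, axis=0)  # fake 7-DoF trajectory
waypoints = extract_waypoints(demo)
print(f"{len(demo)} steps compressed to {len(waypoints)} waypoints")
```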
no code implementations • 4 Jan 2024 • Zipeng Fu, Tony Z. Zhao, Chelsea Finn
We first present Mobile ALOHA, a low-cost and whole-body teleoperation system for data collection.
no code implementations • 19 Mar 2024 • Lucy Xiaoyang Shi, Zheyuan Hu, Tony Z. Zhao, Archit Sharma, Karl Pertsch, Jianlan Luo, Sergey Levine, Chelsea Finn
In this paper, we make the following observation: high-level policies that index into sufficiently rich and expressive low-level language-conditioned skills can be readily supervised with human feedback in the form of language corrections.
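A minimal, assumed sketch of this hierarchy, with hypothetical function names rather than the authors' API: the high-level policy emits a language instruction that indexes into a language-conditioned low-level skill, and a human correction, when given, simply replaces that instruction on the fly.

```python
# Hedged sketch of the hierarchical setup described above (names are
# hypothetical placeholders, not the authors' API).
from typing import Optional


def high_level_policy(observation) -> str:
    # Placeholder: would normally predict the next skill as a language string.
    return "pick up the bag"


def low_level_skill(instruction: str, observation):
    # Placeholder: a language-conditioned policy mapping (instruction, obs) -> action.
    return f"action for '{instruction}'"


def control_step(observation, human_correction: Optional[str] = None):
    # A verbal correction, if present, overrides the high-level policy's choice.
    instruction = human_correction or high_level_policy(observation)
    return low_level_skill(instruction, observation), instruction


obs = {}
print(control_step(obs, None))
print(control_step(obs, "move your gripper slightly higher"))
```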