no code implementations • 14 Dec 2023 • Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, Luyu Wang, Lei Zhang
Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning.
no code implementations • NeurIPS 2023 • Shalev Lifshitz, Keiran Paster, Harris Chan, Jimmy Ba, Sheila McIlraith
Constructing AI models that respond to text instructions is challenging, especially for sequential decision-making tasks.
no code implementations • 21 Nov 2022 • Ted Xiao, Harris Chan, Pierre Sermanet, Ayzaan Wahid, Anthony Brohan, Karol Hausman, Sergey Levine, Jonathan Tompson
To accomplish this, we introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL): we use semi-supervised language labels, leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data, and then train language-conditioned policies on the augmented datasets.
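The relabeling step can be caricatured as follows; a minimal, hypothetical sketch in which toy unit vectors stand in for CLIP's text and episode encoders, and the candidate-instruction set and threshold are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def relabel(episode_emb, candidates, text_embs, threshold=0.5):
    # Score every candidate instruction against the episode embedding
    # and keep the best match only if it clears a confidence threshold.
    scores = [cosine(episode_emb, text_embs[c]) for c in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] >= threshold else None

# Toy vectors standing in for CLIP text/episode embeddings (assumption).
text_embs = {
    "pick up the apple": np.array([1.0, 0.0, 0.0]),
    "open the drawer":   np.array([0.0, 1.0, 0.0]),
}
episode = np.array([0.9, 0.1, 0.0])
label = relabel(episode, list(text_embs), text_embs)
print(label)  # pick up the apple
```

The relabeled demonstration can then be added to the language-conditioned training set as if it had been annotated by a human.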
2 code implementations • 3 Nov 2022 • Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers.
no code implementations • 12 Jul 2022 • Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter
We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction.
1 code implementation • NeurIPS 2021 • Beining Han, Chongyi Zheng, Harris Chan, Keiran Paster, Michael R. Zhang, Jimmy Ba
These changes are often spurious and unrelated to the underlying problem, such as background shifts for agents with visual input.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Harris Chan, Jamie Kiros, William Chan
MGLM is a generative model of the joint distribution over channels.
2 code implementations • ICML 2020 • Silviu Pitis, Harris Chan, Stephen Zhao, Bradly Stadie, Jimmy Ba
What goals should a multi-goal reinforcement learning agent pursue during training in long-horizon tasks?
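One answer this line of work explores is to pursue rare goals at the frontier of what the agent has already achieved. A minimal sketch, assuming a Gaussian kernel density estimate over previously achieved goals (the estimator and bandwidth are illustrative choices, not the paper's exact method):

```python
import numpy as np

def kernel_density(points, query, bandwidth=1.0):
    # Gaussian kernel density estimate of `query` under `points`.
    d2 = np.sum((points - query) ** 2, axis=1)
    return float(np.mean(np.exp(-d2 / (2 * bandwidth ** 2))))

def select_low_density_goal(achieved_goals, bandwidth=1.0):
    # Score each previously achieved goal by its estimated density and
    # return the rarest one, steering training toward the frontier.
    densities = [kernel_density(achieved_goals, g, bandwidth)
                 for g in achieved_goals]
    return achieved_goals[int(np.argmin(densities))]

goals = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
print(select_low_density_goal(goals))  # the isolated goal [5. 5.]
```

Sampling the next training goal near low-density achieved goals increases the entropy of the achieved-goal distribution over time.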
2 code implementations • ICLR 2020 • Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
When defining distances, the triangle inequality has proven to be a useful constraint, both theoretically (to prove convergence and optimality guarantees) and empirically (as an inductive bias).
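One simple family of parametric distances that satisfies this constraint by construction is d(x, y) = ||W(x − y)||, which inherits the triangle inequality directly from the norm. A small sketch (the random linear map `W` is an illustrative assumption) that checks the inequality empirically on random triples:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))  # arbitrary learned/random linear map

def d(x, y):
    # d(x, y) = ||W(x - y)||_2 inherits the triangle inequality from
    # the norm: ||W(x - z)|| <= ||W(x - y)|| + ||W(y - z)||.
    return float(np.linalg.norm(W @ (x - y)))

# Empirically verify the inequality on random triples.
ok = True
for _ in range(1000):
    x, y, z = rng.normal(size=(3, 3))
    ok &= d(x, z) <= d(x, y) + d(y, z) + 1e-9
print(ok)  # True
```

Learned distances with richer parameterizations require more care, since an arbitrary neural network gives no such guarantee.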
no code implementations • 25 Sep 2019 • Harris Chan, Jamie Kiros, William Chan
For conditional generation, the model is given one fully observed channel and generates the remaining k-1 channels in parallel.
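The parallelism in that setting can be caricatured as follows; a toy sketch in which linear "heads" stand in for the per-channel decoders, and the independence of channels given the observed one is an illustrative simplification (all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
k, dim = 4, 3
# One toy decoder head per unobserved channel (stand-ins, not the model).
heads = [rng.normal(size=(dim, dim)) for _ in range(k - 1)]

def generate_parallel(observed):
    # Each remaining channel is produced from the observed channel
    # independently, so the k-1 generations can run in parallel
    # rather than one after another.
    return [h @ observed for h in heads]

outputs = generate_parallel(np.ones(dim))
print(len(outputs))  # 3
```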
no code implementations • ICLR 2019 • Yuhuai Wu, Harris Chan, Jamie Kiros, Sanja Fidler, Jimmy Ba
Sparse reward is one of the most challenging problems in reinforcement learning (RL).
no code implementations • 21 Feb 2019 • Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
We demonstrate that the learning performance of our method is more accurately captured by the structure of the covariance matrix of the gradient noise than by the variance of the gradients.
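The quantity in question, the covariance of the minibatch gradient noise, can be estimated directly. A self-contained sketch on a least-squares toy problem (the model, batch size, and sample count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=256)
w = np.zeros(4)

def grad(idx):
    # Least-squares gradient evaluated on the rows in `idx`.
    err = X[idx] @ w - y[idx]
    return X[idx].T @ err / len(idx)

full = grad(np.arange(len(X)))
# Gradient noise = minibatch gradient minus full-batch gradient.
noise = np.array([grad(rng.choice(len(X), 32, replace=False)) - full
                  for _ in range(500)])
C = noise.T @ noise / len(noise)  # empirical noise covariance matrix
print(C.shape)  # (4, 4)
```

The full matrix `C` retains directional structure that a single scalar variance, `np.trace(C)`, discards.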
no code implementations • 12 Feb 2019 • Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba
We first analyze the differences among goal representations, and show that ACTRCE can efficiently solve difficult reinforcement learning problems in challenging 3D navigation tasks, whereas HER with non-language goal representations fails to learn.
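The hindsight idea with language goals can be sketched as follows; `describe` is a hypothetical stand-in for the teacher that turns an achieved state into a language goal it satisfies:

```python
def describe(state):
    # Hypothetical teacher: render the achieved state as a language goal.
    return f"reach position {state}"

def hindsight_relabel(trajectory):
    # A failed episode is stored again under the language description of
    # what it actually achieved, with reward 1, turning a failure into a
    # positive training example for the goal-conditioned policy.
    final_state = trajectory[-1]
    return describe(final_state), 1.0

trajectory = [(0, 0), (1, 0), (1, 1)]
goal, reward = hindsight_relabel(trajectory)
print(goal)  # reach position (1, 1)
```

As in HER, the relabeled transition is added to the replay buffer alongside the original, failed one.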
no code implementations • 27 Sep 2018 • Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
Unfortunately, a major drawback is the so-called generalization gap: large-batch training typically leads to a degradation in generalization performance of the model as compared to small-batch training.
no code implementations • 7 Sep 2018 • Harris Chan, Atef Chaudhury, Kevin Shen
Classification systems typically act in isolation, meaning they must implicitly memorize the characteristics of all candidate classes in order to classify.