no code implementations • 3 Mar 2023 • Kaizhe Hu, Ray Chen Zheng, Yang Gao, Huazhe Xu
Typical RL methods require considerable online interaction data, which are costly and unsafe to collect in the real world.
no code implementations • 2 Feb 2023 • Ruijie Zheng, Xiyao Wang, Huazhe Xu, Furong Huang
To test this hypothesis, we devise two practical robust training mechanisms, computing adversarial noise and constraining the value network's spectral norm, to directly regularize the Lipschitz condition of the value functions.
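The spectral-norm mechanism above can be sketched with power iteration: estimate each layer's largest singular value and penalize the product of these norms, which upper-bounds the value network's Lipschitz constant. A minimal NumPy illustration (function names and the penalty form are assumptions for exposition, not the paper's implementation):

```python
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W by power iteration."""
    rng = np.random.default_rng(0)
    u = rng.normal(size=W.shape[0])
    v = np.zeros(W.shape[1])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    return float(u @ W @ v)

def lipschitz_penalty(layer_weights, target=1.0):
    """Regularizer pushing the product of per-layer spectral norms
    (an upper bound on the network's Lipschitz constant) toward `target`."""
    bound = 1.0
    for W in layer_weights:
        bound *= spectral_norm(W)
    return max(0.0, bound - target) ** 2
```

In practice this penalty would be added to the value loss, so gradient descent shrinks the layer norms whenever the Lipschitz bound exceeds the target.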
no code implementations • 4 Jan 2023 • Sifan Ye, Yixing Wang, Jiaman Li, Dennis Park, C. Karen Liu, Huazhe Xu, Jiajun Wu
Large-scale capture of human motion with diverse, complex scenes, while immensely useful, is often considered prohibitively costly.
no code implementations • 17 Dec 2022 • Zhecheng Yuan, Zhengrong Xue, Bo Yuan, Xueqian Wang, Yi Wu, Yang Gao, Huazhe Xu
Hence, we propose Pre-trained Image Encoder for Generalizable visual reinforcement learning (PIE-G), a simple yet effective framework that can generalize to the unseen visual scenarios in a zero-shot manner.
no code implementations • 12 Dec 2022 • Nicklas Hansen, Zhecheng Yuan, Yanjie Ze, Tongzhou Mu, Aravind Rajeswaran, Hao Su, Huazhe Xu, Xiaolong Wang
We revisit a simple Learning-from-Scratch baseline for visuo-motor control that uses data augmentation and a shallow ConvNet.
no code implementations • 7 Dec 2022 • Hao Li, Yizhi Zhang, Junzhe Zhu, Shaoxiong Wang, Michelle A Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao, Jiajun Wu
Humans use all of their senses to accomplish different tasks in everyday activities.
no code implementations • 5 Dec 2022 • Can Chang, Ni Mu, Jiajun Wu, Ling Pan, Huazhe Xu
Specifically, we introduce Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance (E-MAPP), a novel framework that leverages parallel programs to guide multiple agents to efficiently accomplish goals that require planning over $10+$ stages.
no code implementations • 24 Oct 2022 • Linfeng Zhao, Huazhe Xu, Lawson L. S. Wong
To alleviate this issue, we propose to differentiate through the Bellman fixed-point equation to decouple the forward and backward passes of Value Iteration Networks and their variants, which enables constant backward cost (in planning horizon) and a flexible forward budget, and helps scale up to large tasks.
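The decoupling idea can be illustrated on a tabular MDP: run value iteration forward to (near) convergence, then obtain gradients by solving the linearized fixed-point equation once via the implicit function theorem, rather than backpropagating through every forward iteration. A NumPy sketch (the tabular setting and function names are illustrative, not the paper's network implementation):

```python
import numpy as np

def value_iteration(P, r, gamma, tol=1e-10):
    """Solve V = max_a [r + gamma * P V] to a fixed point.
    P: (A, S, S) transition tensor, r: (A, S) rewards."""
    V = np.zeros(P.shape[1])
    while True:
        Q = r + gamma * P @ V          # (A, S)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

def implicit_grad_wrt_r(P, r, gamma):
    """dV*/dr at the fixed point: with greedy policy pi*,
    V* = r_pi + gamma * P_pi V*, so dV*/dr_pi = (I - gamma P_pi)^{-1}.
    Cost is one linear solve, independent of how many forward
    iterations were run."""
    V, pi = value_iteration(P, r, gamma)
    S = V.shape[0]
    P_pi = P[pi, np.arange(S), :]      # (S, S) transitions under pi*
    return np.linalg.inv(np.eye(S) - gamma * P_pi)
```

Because the rows of `P_pi` sum to one, each row of the resulting Jacobian sums to `1 / (1 - gamma)`, matching the usual discounted-return sensitivity.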
no code implementations • 18 Oct 2022 • Pu Hua, Yubei Chen, Huazhe Xu
The low-level sensory and motor signals in deep reinforcement learning, which exist in high-dimensional spaces such as image observations or motor torques, are inherently challenging to understand or utilize directly for downstream tasks.
no code implementations • 4 Oct 2022 • Ray Chen Zheng, Kaizhe Hu, Zhecheng Yuan, Boyuan Chen, Huazhe Xu
To tackle this problem, we introduce Extraneousness-Aware Imitation Learning (EIL), a self-supervised approach that learns visuomotor policies from third-person demonstrations with extraneous subsequences.
no code implementations • 28 Sep 2022 • Zhengrong Xue, Zhecheng Yuan, Jiashun Wang, Xueqian Wang, Yang Gao, Huazhe Xu
Can a robot manipulate intra-category unseen objects in arbitrary poses with the help of a mere demonstration of grasping pose on a single object instance?
no code implementations • 24 Jun 2022 • Yunfei Li, Tian Gao, Jiaqi Yang, Huazhe Xu, Yi Wu
It has been a recent trend to leverage the power of supervised learning (SL) towards more effective reinforcement learning (RL) methods.
no code implementations • 5 May 2022 • Haochen Shi, Huazhe Xu, Zhiao Huang, Yunzhu Li, Jiajun Wu
Our learned model-based planning framework is comparable to and sometimes better than human subjects on the tested tasks.
no code implementations • 21 Feb 2022 • Zhecheng Yuan, Guozheng Ma, Yao Mu, Bo Xia, Bo Yuan, Xueqian Wang, Ping Luo, Huazhe Xu
One of the key challenges in visual Reinforcement Learning (RL) is to learn policies that can generalize to unseen environments.
1 code implementation • NeurIPS 2021 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
We analyze NovelD thoroughly in MiniGrid and find that empirically it helps the agent explore the environment more uniformly, with a focus on exploring beyond the boundary.
1 code implementation • NeurIPS 2021 • Jiashun Wang, Huazhe Xu, Medhini Narasimhan, Xiaolong Wang
Thus, instead of predicting each human pose trajectory in isolation, we introduce a Multi-Range Transformers model which contains a local-range encoder for individual motion and a global-range encoder for social interactions.
no code implementations • 22 Nov 2021 • Ling Pan, Longbo Huang, Tengyu Ma, Huazhe Xu
Conservatism has led to significant progress in offline reinforcement learning (RL) where an agent learns from pre-collected datasets.
1 code implementation • ICLR 2022 • Ruihan Yang, Minghao Zhang, Nicklas Hansen, Huazhe Xu, Xiaolong Wang
Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver through environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead.
no code implementations • 10 Jun 2021 • Minghao Zhang, Pingcheng Jian, Yi Wu, Huazhe Xu, Xiaolong Wang
We address the problem of safely solving complex bimanual robot manipulation tasks with sparse rewards.
1 code implementation • 3 Jun 2021 • Huazhe Xu, Yuping Luo, Shaoxiong Wang, Trevor Darrell, Roberto Calandra
The virtuoso plays the piano with passion, poetry and extraordinary technical ability.
1 code implementation • 26 May 2021 • Mike Lambeta, Huazhe Xu, Jingwei Xu, Po-Wei Chou, Shaoxiong Wang, Trevor Darrell, Roberto Calandra
With the increased availability of rich tactile sensors, there is a corresponding need for open-source and integrated software capable of efficiently and effectively processing raw touch measurements into high-level signals that can be used for control and decision-making.
1 code implementation • ICLR 2021 • Yunfei Li, Yilin Wu, Huazhe Xu, Xiaolong Wang, Yi Wu
We propose a novel learning paradigm, Self-Imitation via Reduction (SIR), for solving compositional reinforcement learning problems.
2 code implementations • ICLR 2021 • Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu
We propose a simple, general and effective technique, Reward Randomization for discovering diverse strategic policies in complex multi-agent games.
2 code implementations • 15 Dec 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for intrinsic rewards (IR).
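A toy tabular sketch of that criterion: reward a transition only when the successor state is more novel than the (scaled) novelty of the current state, using inverse visitation counts as the novelty measure. The square-root scaling and the `alpha` regulator here are illustrative assumptions, not the paper's exact hyperparameters:

```python
from collections import defaultdict

class RegulatedNoveltyIR:
    """Intrinsic reward as a clipped, regulated difference of
    inverse-visitation-count novelties between consecutive states."""

    def __init__(self, alpha=0.5):
        self.counts = defaultdict(int)  # tabular visitation counts
        self.alpha = alpha              # regulator on the current state

    def novelty(self, s):
        # Inverse sqrt of visit count; unseen states are maximally novel.
        return 1.0 / (self.counts[s] ** 0.5) if self.counts[s] else 1.0

    def reward(self, s, s_next):
        self.counts[s_next] += 1
        # Positive only when s_next is sufficiently more novel than s.
        return max(self.novelty(s_next) - self.alpha * self.novelty(s), 0.0)
```

Revisiting the same transition yields a shrinking bonus, so the agent is pushed to keep crossing the boundary of its visited region rather than cycling inside it.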
1 code implementation • CVPR 2021 • Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang
Synthesizing 3D human motion plays an important role in many graphics applications as well as understanding human activity.
2 code implementations • 16 Oct 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.
no code implementations • ECCV 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Xiaolong Wang, Trevor Darrell
Generating diverse and natural human motion is one of the long-standing goals for creating intelligent characters in the animated world.
1 code implementation • ICML 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell
In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics.
1 code implementation • NeurIPS 2020 • Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang
While training multiple tasks jointly allows the policies to share parameters across different tasks, the optimization problem becomes non-trivial: it remains unclear what parameters in the network should be reused across tasks, and how the gradients from different tasks may interfere with each other.
Ranked #1 on Meta-Learning on MT50
1 code implementation • 17 Oct 2019 • Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell
The agent is first presented with previous experiences in the training environment, along with a task description in the form of trajectory-level sparse rewards.
no code implementations • 25 Sep 2019 • Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell
In this paper, we propose Scoring-Aggregating-Planning (SAP), a framework that can learn task-agnostic semantics and dynamics priors from arbitrary quality interactions as well as the corresponding sparse rewards and then plan on unseen tasks in zero-shot condition.
no code implementations • 25 Sep 2019 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell
Learning diverse and natural behaviors is one of the longstanding goals for creating intelligent characters in the animated world.
1 code implementation • ICLR 2020 • Yuping Luo, Huazhe Xu, Tengyu Ma
Imitation learning, followed by reinforcement learning algorithms, is a promising paradigm to solve complex control tasks sample-efficiently.
1 code implementation • ICCV 2019 • Hang Gao, Huazhe Xu, Qi-Zhi Cai, Ruth Wang, Fisher Yu, Trevor Darrell
A dynamic scene has two types of elements: those that move fluidly and can be predicted from previous frames, and those that are disoccluded (exposed) and cannot be extrapolated.
no code implementations • 8 Nov 2018 • Dennis Lee, Haoran Tang, Jeffrey O. Zhang, Huazhe Xu, Trevor Darrell, Pieter Abbeel
We present a novel modular architecture for StarCraft II AI.
2 code implementations • ICLR 2019 • Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma
Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL.
no code implementations • ICLR 2018 • Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell
We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.
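NAC's normalization is derived from a soft (maximum-entropy) Q-learning view: subtracting the soft value V = α · logsumexp(Q/α) guarantees no action's normalized score exceeds zero, so actions unseen in the demonstrations cannot be spuriously favored. A rough NumPy sketch of that normalizer under this assumption (the paper's actual update differs; names here are illustrative):

```python
import numpy as np

def soft_value(q, alpha=1.0):
    """Soft value V = alpha * logsumexp(Q / alpha), the normalizer."""
    m = q.max()  # subtract max for numerical stability
    return float(m + alpha * np.log(np.sum(np.exp((q - m) / alpha))))

def normalized_scores(q, alpha=1.0):
    """Q - V: always <= 0; exp((Q - V)/alpha) is the soft-greedy policy."""
    return q - soft_value(q, alpha)
```

Since `logsumexp` dominates the maximum, every normalized score is non-positive, and exponentiating them recovers a proper softmax policy over actions.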
2 code implementations • CVPR 2017 • Huazhe Xu, Yang Gao, Fisher Yu, Trevor Darrell
Robust perception-action models should be learned from training data with diverse visual appearances and realistic behaviors, yet current approaches to deep visuomotor policy learning have been generally limited to in-situ models learned from a single vehicle or a simulation environment.
1 code implementation • CVPR 2016 • Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko, Trevor Darrell
In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object.
Ranked #11 on Referring Expression Comprehension on Talk2Car