1 code implementation • CVPR 2016 • Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko, Trevor Darrell
In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object.
Ranked #12 on Referring Expression Comprehension on Talk2Car
2 code implementations • CVPR 2017 • Huazhe Xu, Yang Gao, Fisher Yu, Trevor Darrell
Robust perception-action models should be learned from training data with diverse visual appearances and realistic behaviors, yet current approaches to deep visuomotor policy learning have been generally limited to in-situ models learned from a single vehicle or a simulation environment.
no code implementations • ICLR 2018 • Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell
We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.
2 code implementations • ICLR 2019 • Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma
Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL.
no code implementations • 8 Nov 2018 • Dennis Lee, Haoran Tang, Jeffrey O. Zhang, Huazhe Xu, Trevor Darrell, Pieter Abbeel
We present a novel modular architecture for StarCraft II AI.
1 code implementation • ICCV 2019 • Hang Gao, Huazhe Xu, Qi-Zhi Cai, Ruth Wang, Fisher Yu, Trevor Darrell
A dynamic scene has two types of elements: those that move fluidly and can be predicted from previous frames, and those which are disoccluded (exposed) and cannot be extrapolated.
1 code implementation • ICLR 2020 • Yuping Luo, Huazhe Xu, Tengyu Ma
Imitation learning, followed by reinforcement learning algorithms, is a promising paradigm to solve complex control tasks sample-efficiently.
no code implementations • 25 Sep 2019 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell
Learning diverse and natural behaviors is one of the longstanding goal for creating intelligent characters in the animated world.
no code implementations • 25 Sep 2019 • Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell
In this paper, we propose Scoring-Aggregating-Planning (SAP), a framework that can learn task-agnostic semantics and dynamics priors from arbitrary quality interactions as well as the corresponding sparse rewards and then plan on unseen tasks in zero-shot condition.
1 code implementation • 17 Oct 2019 • Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell
The agent is first presented with previous experiences in the training environment, along with task description in the form of trajectory-level sparse rewards.
1 code implementation • NeurIPS 2020 • Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang
While training multiple tasks jointly allow the policies to share parameters across different tasks, the optimization problem becomes non-trivial: It remains unclear what parameters in the network should be reused across tasks, and how the gradients from different tasks may interfere with each other.
Ranked #1 on Meta-Learning on MT50
1 code implementation • ICML 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell
In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics.
no code implementations • ECCV 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Xiaolong Wang, Trevor Darrell
Generating diverse and natural human motion is one of the long-standing goals for creating intelligent characters in the animated world.
2 code implementations • 16 Oct 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.
1 code implementation • CVPR 2021 • Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang
Synthesizing 3D human motion plays an important role in many graphics applications as well as understanding human activity.
2 code implementations • 15 Dec 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR.
2 code implementations • ICLR 2021 • Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu
We propose a simple, general and effective technique, Reward Randomization for discovering diverse strategic policies in complex multi-agent games.
1 code implementation • ICLR 2021 • Yunfei Li, Yilin Wu, Huazhe Xu, Xiaolong Wang, Yi Wu
We propose a novel learning paradigm, Self-Imitation via Reduction (SIR), for solving compositional reinforcement learning problems.
1 code implementation • 26 May 2021 • Mike Lambeta, Huazhe Xu, Jingwei Xu, Po-Wei Chou, Shaoxiong Wang, Trevor Darrell, Roberto Calandra
With the increased availability of rich tactile sensors, there is an equally proportional need for open-source and integrated software capable of efficiently and effectively processing raw touch measurements into high-level signals that can be used for control and decision-making.
1 code implementation • 3 Jun 2021 • Huazhe Xu, Yuping Luo, Shaoxiong Wang, Trevor Darrell, Roberto Calandra
The virtuoso plays the piano with passion, poetry and extraordinary technical ability.
no code implementations • 10 Jun 2021 • Minghao Zhang, Pingcheng Jian, Yi Wu, Huazhe Xu, Xiaolong Wang
We address the problem of safely solving complex bimanual robot manipulation tasks with sparse rewards.
1 code implementation • ICLR 2022 • Ruihan Yang, Minghao Zhang, Nicklas Hansen, Huazhe Xu, Xiaolong Wang
Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead.
1 code implementation • 22 Nov 2021 • Ling Pan, Longbo Huang, Tengyu Ma, Huazhe Xu
Conservatism has led to significant progress in offline reinforcement learning (RL) where an agent learns from pre-collected datasets.
1 code implementation • NeurIPS 2021 • Jiashun Wang, Huazhe Xu, Medhini Narasimhan, Xiaolong Wang
Thus, instead of predicting each human pose trajectory in isolation, we introduce a Multi-Range Transformers model which contains of a local-range encoder for individual motion and a global-range encoder for social interactions.
1 code implementation • NeurIPS 2021 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
We analyze NovelD thoroughly in MiniGrid and found that empirically it helps the agent explore the environment more uniformly with a focus on exploring beyond the boundary.
1 code implementation • 21 Feb 2022 • Zhecheng Yuan, Guozheng Ma, Yao Mu, Bo Xia, Bo Yuan, Xueqian Wang, Ping Luo, Huazhe Xu
One of the key challenges in visual Reinforcement Learning (RL) is to learn policies that can generalize to unseen environments.
no code implementations • 5 May 2022 • Haochen Shi, Huazhe Xu, Zhiao Huang, Yunzhu Li, Jiajun Wu
Our learned model-based planning framework is comparable to and sometimes better than human subjects on the tested tasks.
no code implementations • 24 Jun 2022 • Yunfei Li, Tian Gao, Jiaqi Yang, Huazhe Xu, Yi Wu
It has been a recent trend to leverage the power of supervised learning (SL) towards more effective reinforcement learning (RL) methods.
no code implementations • 28 Sep 2022 • Zhengrong Xue, Zhecheng Yuan, Jiashun Wang, Xueqian Wang, Yang Gao, Huazhe Xu
Can a robot manipulate intra-category unseen objects in arbitrary poses with the help of a mere demonstration of grasping pose on a single object instance?
no code implementations • 4 Oct 2022 • Ray Chen Zheng, Kaizhe Hu, Zhecheng Yuan, Boyuan Chen, Huazhe Xu
To tackle this problem, we introduce Extraneousness-Aware Imitation Learning (EIL), a self-supervised approach that learns visuomotor policies from third-person demonstrations with extraneous subsequences.
no code implementations • 18 Oct 2022 • Pu Hua, Yubei Chen, Huazhe Xu
The low-level sensory and motor signals in deep reinforcement learning, which exist in high-dimensional spaces such as image observations or motor torques, are inherently challenging to understand or utilize directly for downstream tasks.
no code implementations • 24 Oct 2022 • Linfeng Zhao, Huazhe Xu, Lawson L. S. Wong
To alleviate this issue, we propose to differentiate through the Bellman fixed-point equation to decouple forward and backward passes for Value Iteration Network and its variants, which enables constant backward cost (in planning horizon) and flexible forward budget and helps scale up to large tasks.
no code implementations • 5 Dec 2022 • Can Chang, Ni Mu, Jiajun Wu, Ling Pan, Huazhe Xu
Specifically, we introduce Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance(E-MAPP), a novel framework that leverages parallel programs to guide multiple agents to efficiently accomplish goals that require planning over $10+$ stages.
Multi-agent Reinforcement Learning reinforcement-learning +2
no code implementations • 7 Dec 2022 • Hao Li, Yizhi Zhang, Junzhe Zhu, Shaoxiong Wang, Michelle A Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao, Jiajun Wu
Humans use all of their senses to accomplish different tasks in everyday activities.
1 code implementation • 12 Dec 2022 • Nicklas Hansen, Zhecheng Yuan, Yanjie Ze, Tongzhou Mu, Aravind Rajeswaran, Hao Su, Huazhe Xu, Xiaolong Wang
In this paper, we examine the effectiveness of pre-training for visuo-motor control tasks.
no code implementations • 17 Dec 2022 • Zhecheng Yuan, Zhengrong Xue, Bo Yuan, Xueqian Wang, Yi Wu, Yang Gao, Huazhe Xu
Hence, we propose Pre-trained Image Encoder for Generalizable visual reinforcement learning (PIE-G), a simple yet effective framework that can generalize to the unseen visual scenarios in a zero-shot manner.
no code implementations • 4 Jan 2023 • Sifan Ye, Yixing Wang, Jiaman Li, Dennis Park, C. Karen Liu, Huazhe Xu, Jiajun Wu
Large-scale capture of human motion with diverse, complex scenes, while immensely useful, is often considered prohibitively costly.
Ranked #3 on 3D Semantic Scene Completion on PRO-teXt
2D Semantic Segmentation task 1 (8 classes) 3D Semantic Scene Completion +1
no code implementations • 2 Feb 2023 • Ruijie Zheng, Xiyao Wang, Huazhe Xu, Furong Huang
To test this hypothesis, we devise two practical robust training mechanisms through computing the adversarial noise and regularizing the value network's spectral norm to directly regularize the Lipschitz condition of the value functions.
1 code implementation • 3 Mar 2023 • Kaizhe Hu, Ray Chen Zheng, Yang Gao, Huazhe Xu
Typical RL methods usually require considerable online interaction data that are costly and unsafe to collect in the real world.
1 code implementation • IEEE International Conference on Robotics and Automation (ICRA) 2023 • Yunfei Li;, Chaoyi Pan, Huazhe Xu, Xiaolong Wang, Yi Wu
We develop a symmetry-aware actor-critic framework that leverages the interchangeable roles of the two manipulators in the bimanual control setting to reduce the policy search space.
no code implementations • 5 Jun 2023 • Tianying Ji, Yu Luo, Fuchun Sun, Xianyuan Zhan, Jianwei Zhang, Huazhe Xu
Learning high-quality Q-value functions plays a key role in the success of many modern off-policy deep reinforcement learning (RL) algorithms.
1 code implementation • 22 Jun 2023 • Ruijie Zheng, Xiyao Wang, Yanchao Sun, Shuang Ma, Jieyu Zhao, Huazhe Xu, Hal Daumé III, Furong Huang
Despite recent progress in reinforcement learning (RL) from raw pixel data, sample inefficiency continues to present a substantial obstacle.
no code implementations • 29 Jun 2023 • Zhengrong Xue, Han Zhang, Jingwen Cheng, Zhengmao He, Yuanchen Ju, Changyi Lin, Gu Zhang, Huazhe Xu
We present ArrayBot, a distributed manipulation system consisting of a $16 \times 16$ array of vertically sliding pillars integrated with tactile sensors, which can simultaneously support, perceive, and manipulate the tabletop objects.
1 code implementation • NeurIPS 2023 • Zhecheng Yuan, Sizhe Yang, Pu Hua, Can Chang, Kaizhe Hu, Huazhe Xu
Visual Reinforcement Learning (Visual RL), coupled with high-dimensional observations, has consistently confronted the long-standing challenge of out-of-distribution generalization.
1 code implementation • 12 Sep 2023 • Yuerong Li, Zhengrong Xue, Huazhe Xu
In this paper, we explore the merits of local features by proposing the unsupervised framework of Object-centric Temporal Action Segmentation (OTAS).
1 code implementation • 2 Oct 2023 • Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang
Collecting large amounts of real-world interaction data to train general robotic policies is often prohibitively expensive, thus motivating the use of simulation data.
no code implementations • 11 Oct 2023 • Xiyao Wang, Ruijie Zheng, Yanchao Sun, Ruonan Jia, Wichayaporn Wongkamjan, Huazhe Xu, Furong Huang
In this paper, we propose $\texttt{COPlanner}$, a planning-driven framework for model-based methods to address the inaccurately learned dynamics model problem with conservative model rollouts and optimistic environment exploration.
2 code implementations • 30 Oct 2023 • Guowei Xu, Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Zhecheng Yuan, Tianying Ji, Yu Luo, Xiaoyu Liu, Jiaxin Yuan, Pu Hua, Shuzhen Li, Yanjie Ze, Hal Daumé III, Furong Huang, Huazhe Xu
To quantify this inactivity, we adopt dormant ratio as a metric to measure inactivity in the RL agent's network.
1 code implementation • 31 Oct 2023 • Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu
Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected datasets.
no code implementations • 6 Nov 2023 • Kun Lei, Zhengmao He, Chenhao Lu, Kaizhe Hu, Yang Gao, Huazhe Xu
Owning to the alignment of objectives in two phases, the RL agent can transfer between offline and online learning seamlessly.
no code implementations • 21 Dec 2023 • Tao Huang, Guangqi Jiang, Yanjie Ze, Huazhe Xu
Learning rewards from expert videos offers an affordable and effective solution to specify the intended behaviors for reinforcement learning tasks.
1 code implementation • 28 Dec 2023 • Ziyu Wang, Yanjie Ze, Yifei Sun, Zhecheng Yuan, Huazhe Xu
Learning policies that can generalize to unseen environments is a fundamental challenge in visual reinforcement learning (RL).
no code implementations • 15 Jan 2024 • Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu
The next step is to map the contact points of the retrieved objects to the new object.
2 code implementations • 24 Jan 2024 • Suning Huang, Boyuan Chen, Huazhe Xu, Vincent Sitzmann
Inspired by nature and recent novel robot designs, we propose to go a step further and explore the novel reconfigurable robots, defined as robots that can change their morphology within their lifetime.
1 code implementation • 9 Feb 2024 • Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Shuang Ma, Hal Daumé III, Huazhe Xu, John Langford, Praveen Palanisamy, Kalyan Shankar Basu, Furong Huang
We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks.
no code implementations • 22 Feb 2024 • Junting Chen, Yao Mu, Qiaojun Yu, Tianming Wei, Silang Wu, Zhecheng Yuan, Zhixuan Liang, Chao Yang, Kaipeng Zhang, Wenqi Shao, Yu Qiao, Huazhe Xu, Mingyu Ding, Ping Luo
To bridge this ``ideal-to-real'' gap, this paper presents \textbf{RobotScript}, a platform for 1) a deployable robot manipulation pipeline powered by code generation; and 2) a code generation benchmark for robot manipulation tasks in free-form natural language.
no code implementations • 22 Feb 2024 • Tianying Ji, Yongyuan Liang, Yan Zeng, Yu Luo, Guowei Xu, Jiawei Guo, Ruijie Zheng, Furong Huang, Fuchun Sun, Huazhe Xu
The varying significance of distinct primitive behaviors during the policy learning process has been overlooked by prior model-free RL algorithms.
1 code implementation • 6 Mar 2024 • Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu
Imitation learning provides an efficient way to teach robots dexterous skills; however, learning complex skills robustly and generalizablely usually consumes large amounts of human demonstrations.
no code implementations • 28 Mar 2024 • Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu
RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, generalizes to unseen SE(3) transformations and instances of target objects, resists visual interference of distracting objects, and follows the near real-time pose change of the target object.
no code implementations • 29 Mar 2024 • Zhengmao He, Kun Lei, Yanjie Ze, Koushil Sreenath, Zhongyu Li, Huazhe Xu
Our approach is validated through simulations and real-world experiments, demonstrating the robot's ability to perform tasks that demand mobility and high precision, such as lifting a basket from the ground while moving, closing a dishwasher, pressing a button, and pushing a door.