Search Results for author: Yuke Zhu

Found 69 papers, 31 papers with code

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition

1 code implementation ECCV 2020 Linxi Fan, Shyamal Buch, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei

We analyze the suitability of our new primitive for video action recognition and explore several novel variations of our approach to enable stronger representational flexibility while maintaining an efficient design.

Action Recognition Temporal Action Localization +1

Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment

no code implementations 15 Nov 2022 Huihan Liu, Soroush Nasiriany, Lance Zhang, Zhiyao Bao, Yuke Zhu

To harness the capabilities of state-of-the-art robot learning models while embracing their imperfections, we present Sirius, a principled framework for humans and robots to collaborate through a division of work.

Decision Making

Learning and Retrieval from Prior Data for Skill-based Imitation Learning

no code implementations 20 Oct 2022 Soroush Nasiriany, Tian Gao, Ajay Mandlekar, Yuke Zhu

Imitation learning offers a promising path for robots to learn general-purpose behaviors, but traditionally has exhibited limited scalability due to high data supervision requirements and brittle generalization.

Data Augmentation Imitation Learning +2

VIMA: General Robot Manipulation with Multimodal Prompts

2 code implementations 6 Oct 2022 Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

This work shows that we can express a wide spectrum of robot manipulation tasks with multimodal prompts, interleaving textual and visual tokens.

Imitation Learning Language Modelling +2

Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments

no code implementations 19 Sep 2022 Mingyo Seo, Ryan Gupta, Yifeng Zhu, Alexy Skoutnev, Luis Sentis, Yuke Zhu

We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision-making to predict navigation commands and low-level gait generation to realize the target commands.

Imitation Learning Reinforcement Learning (RL)

Causal Dynamics Learning for Task-Independent State Abstraction

no code implementations 27 Jun 2022 Zizhao Wang, Xuesu Xiao, Zifan Xu, Yuke Zhu, Peter Stone

Learning dynamics models accurately is an important goal for Model-Based Reinforcement Learning (MBRL), but most MBRL methods learn a dense dynamics model which is vulnerable to spurious correlations and therefore generalizes poorly to unseen states.

Model-based Reinforcement Learning

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

1 code implementation CVPR 2022 Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition, especially when it comes to few-shot learning and compositional reasoning of novel concepts.

Benchmarking Few-Shot Image Classification +5

COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles

1 code implementation CVPR 2022 Jiaxun Cui, Hang Qiu, Dian Chen, Peter Stone, Yuke Zhu

To evaluate our model, we develop AutoCastSim, a network-augmented driving simulation framework with example accident-prone scenarios.

Autonomous Driving

RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning

1 code implementation ICLR 2022 Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

This task remains challenging for current deep learning algorithms since it requires addressing three key technical problems jointly: 1) identifying object entities and their properties, 2) inferring semantic relations between pairs of entities, and 3) generalizing to novel object-relation combinations, i.e., systematic generalization.

Human-Object Interaction Detection Retrieval +4

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

no code implementations 14 Mar 2022 Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability.

Contrastive Learning Deformable Object Manipulation

Ditto: Building Digital Twins of Articulated Objects from Interaction

no code implementations CVPR 2022 Zhenyu Jiang, Cheng-Chun Hsu, Yuke Zhu

We also apply Ditto to real-world objects and deploy the recreated digital twins in physical simulation.

Mixed Reality

Pre-Trained Language Models for Interactive Decision-Making

1 code implementation 3 Feb 2022 Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu

Together, these results suggest that language modeling induces representations that are useful for modeling not just language, but also goals and plans; these representations can aid learning and generalization even outside of language processing.

Imitation Learning Language Modelling

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

no code implementations 15 Nov 2021 Youngwoon Lee, Joseph J. Lim, Anima Anandkumar, Yuke Zhu

However, these approaches require larger state distributions to be covered as more policies are sequenced, and thus are limited to short skill sequences.

Reinforcement Learning (RL) Robot Manipulation

Reinforcement Learning in Factored Action Spaces using Tensor Decompositions

no code implementations 27 Oct 2021 Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar

We present an extended abstract for the previously published work TESSERACT [Mahajan et al., 2021], which proposes a novel solution for Reinforcement Learning (RL) in large, factored action spaces using tensor decompositions.

Multi-agent Reinforcement Learning reinforcement-learning +1

Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks

1 code implementation 7 Oct 2021 Soroush Nasiriany, Huihan Liu, Yuke Zhu

Realistic manipulation tasks require a robot to interact with an environment with a prolonged sequence of motor actions.

reinforcement-learning Reinforcement Learning (RL)

OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

1 code implementation 2 Oct 2021 Josiah Wong, Viktor Makoviychuk, Anima Anandkumar, Yuke Zhu

Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.

Robot Manipulation

Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation

no code implementations 28 Sep 2021 Yifeng Zhu, Peter Stone, Yuke Zhu

From the task structures of multi-task demonstrations, we identify skills based on the recurring patterns and train goal-conditioned sensorimotor policies with hierarchical imitation learning.

Imitation Learning Robot Manipulation

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation

1 code implementation 6 Aug 2021 Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang, Rohun Kulkarni, Li Fei-Fei, Silvio Savarese, Yuke Zhu, Roberto Martín-Martín

Based on the study, we derive a series of lessons including the sensitivity to different algorithmic design choices, the dependence on the quality of the demonstrations, and the variability based on the stopping criteria due to the different objectives in training and evaluation.

Imitation Learning reinforcement-learning +2

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning

2 code implementations 15 Jul 2021 Paul Pu Liang, Yiwei Lyu, Xiang Fan, Zetian Wu, Yun Cheng, Jason Wu, Leslie Chen, Peter Wu, Michelle A. Lee, Yuke Zhu, Ruslan Salakhutdinov, Louis-Philippe Morency

In order to accelerate progress towards understudied modalities and tasks while ensuring real-world robustness, we release MultiBench, a systematic and unified large-scale benchmark spanning 15 datasets, 10 modalities, 20 prediction tasks, and 6 research areas.

Representation Learning

Discovering Generalizable Skills via Automated Generation of Diverse Tasks

no code implementations 26 Jun 2021 Kuan Fang, Yuke Zhu, Silvio Savarese, Li Fei-Fei

To encourage generalizable skills to emerge, our method trains each skill to specialize in the paired task and maximizes the diversity of the generated tasks.

Hierarchical Reinforcement Learning reinforcement-learning +1

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

1 code implementation 17 Jun 2021 Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar

A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert.

Autonomous Driving Image Augmentation +2

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

no code implementations 31 May 2021 Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar

Algorithms derived from Tesseract decompose the Q-tensor across agents and utilise low-rank tensor approximations to model agent interactions relevant to the task.

Learning Theory Multi-agent Reinforcement Learning +3

Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition

1 code implementation 18 May 2021 Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Animashree Anandkumar

Specifically, we 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players.

Multi-agent Reinforcement Learning reinforcement-learning +2

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

1 code implementation 4 Apr 2021 Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yuke Zhu

The experimental results in simulation and on the real robot have demonstrated that the use of implicit neural representations and joint learning of grasp affordance and 3D reconstruction have led to state-of-the-art grasping results.

3D Reconstruction Multi-Task Learning

Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

1 code implementation CVPR 2021 Yifan Sun, Yuke Zhu, Yuhan Zhang, Pengkun Zheng, Xi Qiu, Chi Zhang, Yichen Wei

We argue that such flexibility is also important for deep metric learning, because different visual concepts indeed correspond to different semantic scales.

Metric Learning

A Coach-Player Framework for Dynamic Team Composition

no code implementations 1 Jan 2021 Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Anima Anandkumar

The performance of our method is comparable to or even better than the setting where all players have a full view of the environment, but no coach.

Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects

no code implementations 22 Dec 2020 Xinlei Pan, Animesh Garg, Animashree Anandkumar, Yuke Zhu

Through experimentation and comparative study, we demonstrate the effectiveness of our approach in discovering robust and cost-efficient hand morphologies for grasping novel objects.

Human-in-the-Loop Imitation Learning using Remote Teleoperation

no code implementations 12 Dec 2020 Ajay Mandlekar, Danfei Xu, Roberto Martín-Martín, Yuke Zhu, Li Fei-Fei, Silvio Savarese

We develop a simple and effective algorithm to train the policy iteratively on new data collected by the system that encourages the policy to learn how to traverse bottlenecks through the interventions.

Imitation Learning Robot Manipulation

Learning Multi-Arm Manipulation Through Collaborative Teleoperation

no code implementations 12 Dec 2020 Albert Tung, Josiah Wong, Ajay Mandlekar, Roberto Martín-Martín, Yuke Zhu, Li Fei-Fei, Silvio Savarese

To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks.

Imitation Learning

Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors

no code implementations 1 Dec 2020 Michelle A. Lee, Matthew Tan, Yuke Zhu, Jeannette Bohg

Using sensor data from multiple modalities presents an opportunity to encode redundant and complementary features that can be useful when one modality is corrupted or noisy.

Fast Uncertainty Quantification for Deep Object Pose Estimation

no code implementations 16 Nov 2020 Guanya Shi, Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Fabio Ramos, Animashree Anandkumar, Yuke Zhu

Deep learning-based object pose estimators are often unreliable and overconfident, especially when the input image is outside the training domain, for instance, with sim2real transfer.

Pose Estimation

Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion

no code implementations 21 Sep 2020 Xingye Da, Zhaoming Xie, David Hoeller, Byron Boots, Animashree Anandkumar, Yuke Zhu, Buck Babich, Animesh Garg

We present a hierarchical framework that combines model-based control and reinforcement learning (RL) to synthesize robust controllers for a quadruped (the Unitree Laikago).

reinforcement-learning Reinforcement Learning (RL)

OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation

1 code implementation 17 Aug 2020 Hongyu Ren, Yuke Zhu, Jure Leskovec, Anima Anandkumar, Animesh Garg

We propose a variational inference framework OCEAN to perform online task inference for compositional tasks.

Variational Inference

Spherical Feature Transform for Deep Metric Learning

no code implementations ECCV 2020 Yuke Zhu, Yan Bai, Yichen Wei

Consequently, the feature transform is performed by a rotation that respects the spherical data distributions.

Data Augmentation Metric Learning +1

Adaptive Procedural Task Generation for Hard-Exploration Problems

no code implementations ICLR 2021 Kuan Fang, Yuke Zhu, Silvio Savarese, Li Fei-Fei

To enable curriculum learning in the absence of a direct indicator of learning progress, we propose to train the task generator by balancing the agent's performance in the generated tasks and the similarity to the target tasks.

Scaling Robot Supervision to Hundreds of Hours with RoboTurk: Robotic Manipulation Dataset through Human Reasoning and Dexterity

no code implementations 11 Nov 2019 Ajay Mandlekar, Jonathan Booher, Max Spero, Albert Tung, Anchit Gupta, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei

We evaluate the quality of our platform, the diversity of demonstrations in our dataset, and the utility of our dataset via quantitative and qualitative analysis.

Robot Manipulation

Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation

no code implementations 29 Oct 2019 Kuan Fang, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei

The fundamental challenge of planning for multi-step manipulation is to find effective and plausible action sequences that lead to the task goal.

Variational Inference

KETO: Learning Keypoint Representations for Tool Manipulation

no code implementations 26 Oct 2019 Zengyi Qin, Kuan Fang, Yuke Zhu, Li Fei-Fei, Silvio Savarese

For this purpose, we present KETO, a framework of learning keypoint representations of tool-based manipulation.


Causal Induction from Visual Observations for Goal Directed Tasks

2 code implementations 3 Oct 2019 Suraj Nair, Yuke Zhu, Silvio Savarese, Li Fei-Fei

Causal reasoning has been an indispensable capability for humans and other intelligent animals to interact with the physical world.

Regression Planning Networks

1 code implementation NeurIPS 2019 Danfei Xu, Roberto Martín-Martín, De-An Huang, Yuke Zhu, Silvio Savarese, Li Fei-Fei

Recent learning-to-plan methods have shown promising results on planning directly from observation space.


DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs

1 code implementation 28 Sep 2019 Yunbo Wang, Bo Liu, Jiajun Wu, Yuke Zhu, Simon S. Du, Li Fei-Fei, Joshua B. Tenenbaum

A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty.

Continuous Control

Situational Fusion of Visual Representation for Visual Navigation

no code implementations ICCV 2019 Bokui Shen, Danfei Xu, Yuke Zhu, Leonidas J. Guibas, Li Fei-Fei, Silvio Savarese

A complex visual navigation task puts an agent in different situations which call for a diverse range of visual perception abilities.

Visual Navigation

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning

no code implementations 16 Aug 2019 De-An Huang, Danfei Xu, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei, Juan Carlos Niebles

The key technical challenge is that the symbol grounding is prone to error with limited training data and leads to subsequent symbolic planning failures.

Imitation Learning

RoboTurk: A Crowdsourcing Platform for Robotic Skill Learning through Imitation

no code implementations 7 Nov 2018 Ajay Mandlekar, Yuke Zhu, Animesh Garg, Jonathan Booher, Max Spero, Albert Tung, Julian Gao, John Emmons, Anchit Gupta, Emre Orbay, Silvio Savarese, Li Fei-Fei

Imitation Learning has empowered recent advances in learning robotic manipulation tasks by addressing shortcomings of Reinforcement Learning such as exploration and reward specification.

Imitation Learning

Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration

no code implementations CVPR 2019 De-An Huang, Suraj Nair, Danfei Xu, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese, Juan Carlos Niebles

We hypothesize that to successfully generalize to unseen complex tasks from a single video demonstration, it is necessary to explicitly incorporate the compositional structure of the tasks into the model.

Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision

no code implementations 25 Jun 2018 Kuan Fang, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Viraj Mehta, Li Fei-Fei, Silvio Savarese

We perform both simulated and real-world experiments on two tool-based manipulation tasks: sweeping and hammering.

Neural Task Programming: Learning to Generalize Across Hierarchical Tasks

1 code implementation 4 Oct 2017 Danfei Xu, Suraj Nair, Yuke Zhu, Julian Gao, Animesh Garg, Li Fei-Fei, Silvio Savarese

In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the idea of few-shot learning from demonstration and neural program induction.

Few-Shot Learning Program induction +1

Scene Graph Generation by Iterative Message Passing

3 code implementations CVPR 2017 Danfei Xu, Yuke Zhu, Christopher B. Choy, Li Fei-Fei

In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image.

Graph Generation Panoptic Scene Graph Generation

Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning

2 code implementations 16 Sep 2016 Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi

To address the second issue, we propose the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.

3D Reconstruction Feature Engineering +3

Action Recognition by Hierarchical Mid-level Action Elements

no code implementations ICCV 2015 Tian Lan, Yuke Zhu, Amir Roshan Zamir, Silvio Savarese

Realistic videos of human actions exhibit rich spatiotemporal structures at multiple levels of granularity: an action can always be decomposed into multiple finer-grained elements in both space and time.

Action Parsing Action Recognition +1

Building a Large-scale Multimodal Knowledge Base System for Answering Visual Queries

no code implementations 20 Jul 2015 Yuke Zhu, Ce Zhang, Christopher Ré, Li Fei-Fei

The complexity of the visual world creates significant challenges for comprehensive visual understanding.

