3D-MVP: 3D Multiview Pretraining for Robotic Manipulation

no code implementations 26 Jun 2024 Shengyi Qian, Kaichun Mo, Valts Blukis, David F. Fouhey, Dieter Fox, Ankit Goyal

Our results suggest that 3D-aware pretraining is a promising approach to improve sample efficiency and generalization of vision-based robotic manipulation policies.

Decoder Robot Manipulation

RVT-2: Learning Precise Manipulation from Few Demonstrations

1 code implementation 12 Jun 2024 Ankit Goyal, Valts Blukis, Jie Xu, Yijie Guo, Yu-Wei Chao, Dieter Fox

In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given language instructions.

Robot Manipulation Generalization

AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent

no code implementations 11 Apr 2024 Tongzhou Mu, Yijie Guo, Jie Xu, Ankit Goyal, Hao Su, Dieter Fox, Animesh Garg

Encouraged by the remarkable achievements of language and vision foundation models, developing generalist robotic agents through imitation learning, using large demonstration datasets, has become a prominent area of interest in robot learning.

Imitation Learning

Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement

no code implementations 10 Jul 2023 Anthony Simeonov, Ankit Goyal, Lucas Manuelli, Lin Yen-Chen, Alina Sarmiento, Alberto Rodriguez, Pulkit Agrawal, Dieter Fox

We propose a system for rearranging objects in a scene to achieve a desired object-scene placing relationship, such as a book inserted in an open slot of a bookshelf.

RVT: Robotic View Transformer for 3D Object Manipulation

1 code implementation 26 Jun 2023 Ankit Goyal, Jie Xu, Yijie Guo, Valts Blukis, Yu-Wei Chao, Dieter Fox

In simulations, we find that a single RVT model works well across 18 RLBench tasks with 249 task variations, achieving 26% higher relative success than the existing state-of-the-art method (PerAct).

Object Robot Manipulation

ProgPrompt: Generating Situated Robot Task Plans using Large Language Models

no code implementations 22 Sep 2022 Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg

To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information.

Task Planning

Coupled Iterative Refinement for 6D Multi-Object Pose Estimation

1 code implementation CVPR 2022 Lahav Lipson, Zachary Teed, Ankit Goyal, Jia Deng

We propose a new approach to 6D object pose estimation which consists of an end-to-end differentiable architecture that makes use of geometric knowledge.

6D Pose Estimation using RGB Object

IFOR: Iterative Flow Minimization for Robotic Object Rearrangement

no code implementations CVPR 2022 Ankit Goyal, Arsalan Mousavian, Chris Paxton, Yu-Wei Chao, Brian Okorn, Jia Deng, Dieter Fox

Accurate object rearrangement from vision is a crucial problem for a wide variety of real-world robotics applications in unstructured environments.

Object Optical Flow Estimation

Emotions are Subtle: Learning Sentiment Based Text Representations Using Contrastive Learning

no code implementations 2 Dec 2021 Ipsita Mohanty, Ankit Goyal, Alex Dotterweich

Contrastive learning techniques have been widely used in the field of computer vision as a means of augmenting datasets.

Contrastive Learning Sentiment Analysis

Non-deep Networks

4 code implementations 14 Oct 2021 Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun

This begs the question -- is it possible to build high-performing "non-deep" neural networks?

Image Classification Real-Time Object Detection

Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline

3 code implementations 9 Jun 2021 Ankit Goyal, Hei Law, Bowei Liu, Alejandro Newell, Jia Deng

It also outperforms state-of-the-art methods on ScanObjectNN, a real-world point cloud benchmark, and demonstrates better cross-dataset generalization.

Point Cloud Classification

Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D

2 code implementations NeurIPS 2020 Ankit Goyal, Kaiyu Yang, Dawei Yang, Jia Deng

The 3D scenes in our dataset come in minimally contrastive pairs: two scenes in a pair are almost identical, but a spatial relation holds in one and fails in the other.

Relation Spatial Relation Recognition

PackIt: A Virtual Environment for Geometric Planning

1 code implementation ICML 2020 Ankit Goyal, Jia Deng

The ability to jointly understand the geometry of objects and plan actions for manipulating them is crucial for intelligent agents.

Robot Task Planning

