Search Results for author: Arsalan Mousavian

Found 32 papers, 15 papers with code

RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics

no code implementations • 15 Jun 2024 • Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

In spite of the recent adoption of vision language models (VLMs) to control robot behavior, VLMs struggle to precisely articulate robot actions using language.

Language Modelling Robot Navigation +2

M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place

no code implementations • 2 Nov 2023 • Wentao Yuan, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation.

Decision Making valid

CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation

no code implementations • 18 Apr 2023 • Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam Fishman, Dieter Fox

CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene.

Navigate Object +1

MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare

no code implementations • 13 Dec 2022 • Yann Labbé, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, Josef Sivic

Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.

6D Pose Estimation Object

DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations

no code implementations • 28 Sep 2022 • Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox

The policy learned from our dataset can generalize well on unseen object poses in both simulation and the real world.


ProgPrompt: Generating Situated Robot Task Plans using Large Language Models

no code implementations • 22 Sep 2022 • Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg

To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information.

IFOR: Iterative Flow Minimization for Robotic Object Rearrangement

no code implementations • CVPR 2022 • Ankit Goyal, Arsalan Mousavian, Chris Paxton, Yu-Wei Chao, Brian Okorn, Jia Deng, Dieter Fox

Accurate object rearrangement from vision is a crucial problem for a wide variety of real-world robotics applications in unstructured environments.

Object Optical Flow Estimation

RICE: Refining Instance Masks in Cluttered Environments with Graph Neural Networks

1 code implementation • 29 Jun 2021 • Christopher Xie, Arsalan Mousavian, Yu Xiang, Dieter Fox

We postulate that a network architecture that encodes relations between objects at a high-level can be beneficial.

Graph Neural Network

NeRP: Neural Rearrangement Planning for Unknown Objects

no code implementations • 2 Jun 2021 • Ahmed H. Qureshi, Arsalan Mousavian, Chris Paxton, Michael C. Yip, Dieter Fox

We propose NeRP (Neural Rearrangement Planning), a deep learning based approach for multi-step neural object rearrangement planning which works with never-before-seen objects, that is trained on simulation data, and generalizes to the real world.

RGB-D Local Implicit Function for Depth Completion of Transparent Objects

1 code implementation • CVPR 2021 • Luyang Zhu, Arsalan Mousavian, Yu Xiang, Hammad Mazhar, Jozef van Eenbergen, Shoubhik Debnath, Dieter Fox

Key to our approach is a local implicit neural representation built on ray-voxel pairs that allows our method to generalize to unseen objects and achieve fast inference speed.

Depth Completion Depth Estimation +1

Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes

1 code implementation • 25 Mar 2021 • Martin Sundermeyer, Arsalan Mousavian, Rudolph Triebel, Dieter Fox

Our novel grasp representation treats 3D points of the recorded point cloud as potential grasp contacts.

Grasp Generation Robotic Grasping

Object Rearrangement Using Learned Implicit Collision Functions

1 code implementation • 21 Nov 2020 • Michael Danielczuk, Arsalan Mousavian, Clemens Eppner, Dieter Fox

The learned model outperforms both traditional pipelines and learned ablations by 9.8% in accuracy on a dataset of simulated collision queries and is 75x faster than the best-performing baseline.


ACRONYM: A Large-Scale Grasp Dataset Based on Simulation

2 code implementations • 18 Nov 2020 • Clemens Eppner, Arsalan Mousavian, Dieter Fox

We introduce ACRONYM, a dataset for robot grasp planning based on physics simulation.

Reactive Long Horizon Task Execution via Visual Skill and Precondition Models

no code implementations • 17 Nov 2020 • Shohin Mukherjee, Chris Paxton, Arsalan Mousavian, Adam Fishman, Maxim Likhachev, Dieter Fox

Zero-shot execution of unseen robotic tasks is important to allowing robots to perform a wide variety of tasks in human environments, but collecting the amounts of data necessary to train end-to-end policies in the real-world is often infeasible.

Reactive Human-to-Robot Handovers of Arbitrary Objects

no code implementations • 17 Nov 2020 • Wei Yang, Chris Paxton, Arsalan Mousavian, Yu-Wei Chao, Maya Cakmak, Dieter Fox

We demonstrate the generalizability, usability, and robustness of our approach on a novel benchmark set of 26 diverse household objects, a user study with naive users (N=6) handing over a subset of 15 objects, and a systematic evaluation examining different ways of handing objects.

Grasp Generation Motion Planning

Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds

1 code implementation • 2 Oct 2020 • Lirui Wang, Yu Xiang, Wei Yang, Arsalan Mousavian, Dieter Fox

We demonstrate that our learned policy can be integrated into a tabletop 6D grasping system and a human-robot handover system to improve the grasping performance of unseen objects.

Imitation Learning Motion Planning +2

Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

1 code implementation • 30 Jul 2020 • Yu Xiang, Christopher Xie, Arsalan Mousavian, Dieter Fox

In this work, we propose a new method for unseen object instance segmentation by learning RGB-D feature embeddings from synthetic data.

Clustering Metric Learning +4

A Billion Ways to Grasp: An Evaluation of Grasp Sampling Schemes on a Dense, Physics-based Grasp Data Set

no code implementations • 11 Dec 2019 • Clemens Eppner, Arsalan Mousavian, Dieter Fox

With the increasing speed and quality of physics simulations, generating large-scale grasping data sets that feed learning algorithms is becoming more and more popular.

Self-supervised 6D Object Pose Estimation for Robot Manipulation

3 code implementations • 23 Sep 2019 • Xinke Deng, Yu Xiang, Arsalan Mousavian, Clemens Eppner, Timothy Bretl, Dieter Fox

In this way, our system is able to continuously collect data and improve its pose estimation modules.


The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation

no code implementations • 30 Jul 2019 • Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox

We show that our method, trained on this dataset, can produce sharp and accurate masks, outperforming state-of-the-art methods on unseen object instance segmentation.

Object Segmentation +2

PoseRBPF: A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking

1 code implementation • 22 May 2019 • Xinke Deng, Arsalan Mousavian, Yu Xiang, Fei Xia, Timothy Bretl, Dieter Fox

In this work, we formulate the 6D object pose tracking problem in the Rao-Blackwellized particle filtering framework, where the 3D rotation and the 3D translation of an object are decoupled.

6D Pose Estimation 6D Pose Estimation using RGB +3

Visual Representations for Semantic Target Driven Navigation

3 code implementations • 15 May 2018 • Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson

We propose using high-level semantic and contextual features, including segmentation and detection masks obtained by off-the-shelf state-of-the-art vision methods, as observations, and using a deep network to learn the navigation policy.

Domain Adaptation Visual Navigation

Synthesizing Training Data for Object Detection in Indoor Scenes

no code implementations • 25 Feb 2017 • Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka

In this work we explore the ability of using synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection.

Object object-detection +1

3D Bounding Box Estimation Using Deep Learning and Geometry

11 code implementations • CVPR 2017 • Arsalan Mousavian, Dragomir Anguelov, John Flynn, Jana Kosecka

In contrast to current techniques that only regress the 3D orientation of an object, our method first regresses relatively stable 3D object properties using a deep convolutional neural network and then combines these estimates with geometric constraints provided by a 2D object bounding box to produce a complete 3D bounding box.

3D Object Detection Object +4

Multiview RGB-D Dataset for Object Instance Detection

no code implementations • 26 Sep 2016 • Georgios Georgakis, Md. Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, Jana Kosecka

This paper presents a new multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments including a subset of objects from the BigBird dataset.

Object object-detection +1

Semantic Image Based Geolocation Given a Map

no code implementations • 1 Sep 2016 • Arsalan Mousavian, Jana Kosecka

In this work we present an approach for geo-locating a novel view and determining camera location and orientation using a map and a sparse set of geo-tagged reference views.

Visual Place Recognition

Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks

no code implementations • 25 Apr 2016 • Arsalan Mousavian, Hamed Pirsiavash, Jana Kosecka

The proposed model is trained and evaluated on the NYUDepth V2 dataset, outperforming state-of-the-art methods on semantic segmentation and achieving comparable results on depth estimation.

Depth Estimation Segmentation +1

Deep Convolutional Features for Image Based Retrieval and Scene Categorization

no code implementations • 20 Sep 2015 • Arsalan Mousavian, Jana Kosecka

Several recent approaches showed how the representations learned by Convolutional Neural Networks can be repurposed for novel tasks.

Image Retrieval Retrieval
