Open6DOR: Benchmarking Open-instruction 6-DoF Object Rearrangement and A VLM-based Approach

In this work, we present a pioneering benchmark and approach for table-top Open-instruction 6-DoF Object Rearrangement (Open6DOR). Specifically, we collect a synthetic dataset of 200+ objects and carefully design 2400+ Open6DOR tasks. These tasks are divided into a Position-track, a Rotation-track, and a 6-DoF-track, which evaluate how well different embodied agents predict the positions and rotations of target objects. We also propose a VLM-based approach for Open6DOR, named Open6DOR-GPT, which equips GPT-4V with 3D-awareness and simulation-assistance while exploiting its strengths in generalizability and instruction-following. We compare existing embodied agents with Open6DOR-GPT on the proposed Open6DOR benchmark and find that Open6DOR-GPT achieves state-of-the-art performance. We further demonstrate the strong performance of Open6DOR-GPT in diverse real-world experiments. We plan to release the final version of the benchmark, along with our refined method, in early September, and we recommend waiting until then to download the dataset.
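To make the three tracks concrete, the following is a minimal sketch of how a predicted 6-DoF pose could be scored against a goal pose on position only, rotation only, or both. It is an illustration under stated assumptions, not the benchmark's evaluation protocol: the `Pose` class, the function names, the metre and degree units, and the tolerance values are all hypothetical, and the actual Open6DOR tasks may judge success differently (for example, through spatial relations stated in the instruction rather than a single goal pose).

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Hypothetical 6-DoF pose: 3 translational + 3 rotational degrees of freedom."""
    position: tuple[float, float, float]            # (x, y, z), metres assumed
    quaternion: tuple[float, float, float, float]   # unit quaternion (w, x, y, z)


def position_error(a: Pose, b: Pose) -> float:
    """Euclidean distance between the two positions."""
    return math.dist(a.position, b.position)


def rotation_error_deg(a: Pose, b: Pose) -> float:
    """Smallest rotation angle (degrees) between two unit quaternions."""
    dot = abs(sum(p * q for p, q in zip(a.quaternion, b.quaternion)))
    dot = min(1.0, dot)  # guard against floating-point overshoot
    return math.degrees(2.0 * math.acos(dot))


def satisfied(pred: Pose, goal: Pose, track: str,
              pos_tol: float = 0.05, rot_tol_deg: float = 15.0) -> bool:
    """Check a prediction on one track; tolerances are illustrative assumptions."""
    pos_ok = position_error(pred, goal) <= pos_tol
    rot_ok = rotation_error_deg(pred, goal) <= rot_tol_deg
    if track == "position":
        return pos_ok
    if track == "rotation":
        return rot_ok
    if track == "6dof":
        return pos_ok and rot_ok
    raise ValueError(f"unknown track: {track}")
```

Under these assumptions, a prediction that lands within the position tolerance but is rotated by 90 degrees would pass the position check and fail both the rotation and the 6-DoF checks, which is why a combined track is stricter than either single track.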


Results from the Paper


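| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Object Rearrangement | Open6DOR V2 | Open6DOR | pos-level0 | 60.3 | # 2 |
| Object Rearrangement | Open6DOR V2 | Open6DOR | 6-DoF | 35.6 | # 2 |
| Object Rearrangement | Open6DOR V2 | Open6DOR | pos-level1 | 78.6 | # 2 |
| Object Rearrangement | Open6DOR V2 | Open6DOR | rot-level0 | 45.7 | # 2 |
| Object Rearrangement | Open6DOR V2 | Open6DOR | rot-level1 | 32.5 | # 2 |
| Object Rearrangement | Open6DOR V2 | Open6DOR | rot-level2 | 49.8 | # 2 |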
