…Using Colosseum, we compare 4 state-of-the-art manipulation models and find that their success rates degrade by 30-50% across these perturbation factors.
1 PAPER • 1 BENCHMARK