1 code implementation • 30 Jun 2023 • Theophile Gervet, Zhou Xian, Nikolaos Gkanatsios, Katerina Fragkiadaki
3D perceptual representations are well suited for robot manipulation as they easily encode occlusions and simplify spatial reasoning.
Ranked #1 on Robot Manipulation on RLBench
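To make the 3D-representation idea above concrete, here is a minimal sketch of back-projecting a depth image into a camera-frame point cloud with pinhole intrinsics; the function name, intrinsics, and toy depth values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (H, W) into camera-frame 3D points.

    fx, fy, cx, cy are pinhole intrinsics. Pixels with zero depth
    (no sensor return) are dropped, which is one way occlusion
    shows up naturally in a 3D representation.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # keep only observed surface points

# Toy usage: a flat plane 2 m away, seen by a 640x480 camera
depth = np.full((480, 640), 2.0)
cloud = depth_to_pointcloud(depth, fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(cloud.shape)  # (307200, 3)
```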
no code implementations • 27 Apr 2023 • Nikolaos Gkanatsios, Ayush Jain, Zhou Xian, Yunchu Zhang, Christopher Atkeson, Katerina Fragkiadaki
Language is compositional; an instruction can express multiple relational constraints that must hold among the objects in a scene a robot is tasked to rearrange.
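As a schematic illustration of composing relation constraints (not the paper's actual model), one can score candidate object placements by summing per-relation costs and searching for a low-cost arrangement; the cost functions and the random-search loop below are my own simplifications.

```python
import numpy as np

def left_of(a, b):
    """Penalty if a is not left of b (x grows rightward)."""
    return max(0.0, a[0] - b[0])

def near(a, b, d=1.0):
    """Penalty if a is farther than d from b."""
    return max(0.0, np.linalg.norm(a - b) - d)

def total_cost(pos, constraints):
    return sum(f(pos[i], pos[j]) for f, i, j in constraints)

# "Move the mug left of the plate and near the bowl"
pos = {"mug": np.array([2.0, 0.0]),
       "plate": np.array([0.0, 0.0]),
       "bowl": np.array([-0.5, 1.0])}
constraints = [(left_of, "mug", "plate"), (near, "mug", "bowl")]

rng = np.random.default_rng(0)
best = pos["mug"]
for _ in range(2000):  # simple random-search refinement of the mug's position
    cand = best + rng.normal(scale=0.2, size=2)
    if total_cost({**pos, "mug": cand}, constraints) < \
       total_cost({**pos, "mug": best}, constraints):
        best = cand
print(best)  # a placement satisfying both constraints at once
```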
no code implementations • 27 Apr 2023 • Nikolaos Gkanatsios, Mayank Singh, Zhaoyuan Fang, Shubham Tulsiani, Katerina Fragkiadaki
We present Analogical Networks, a model that encodes domain knowledge explicitly, in a collection of structured, labelled 3D scenes, in addition to implicitly, as model parameters. The model segments 3D object scenes with analogical reasoning: instead of mapping a scene to part segments directly, it first retrieves related scenes and their corresponding part structures from memory, and then predicts analogous part structures for the input scene via an end-to-end learnable modulation mechanism.
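A schematic sketch of the retrieve-then-predict idea follows; the embedding size, memory contents, and function names are illustrative, and the actual modulation network is omitted.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def retrieve(scene_emb, memory):
    """memory: list of (scene embedding, part structure) pairs."""
    sims = [cosine(scene_emb, emb) for emb, _ in memory]
    return memory[int(np.argmax(sims))]

rng = np.random.default_rng(0)
memory = [(rng.normal(size=64), {"parts": ["seat", "back", "legs"]}),
          (rng.normal(size=64), {"parts": ["body", "wings", "tail"]})]

query = memory[0][0] + 0.1 * rng.normal(size=64)  # scene resembling a chair
emb, parts = retrieve(query, memory)
# The retrieved part structure would then condition (modulate) the
# segmentation decoder, rather than decoding parts from scratch.
print(parts)  # {'parts': ['seat', 'back', 'legs']}
```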
1 code implementation • 16 Dec 2021 • Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki
We propose a language grounding model that attends to the referential utterance and to the object proposal pool computed by a pre-trained detector, decoding referenced objects with a detection head rather than selecting them from the pool.
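A minimal sketch of this decode-rather-than-select pattern is shown below: learned queries cross-attend to utterance tokens and to proposal features, then a detection head regresses boxes directly. Module names, dimensions, and the absence of residual connections are simplifying assumptions, not the released model.

```python
import torch
import torch.nn as nn

class GroundingDecoder(nn.Module):
    def __init__(self, d=256, n_queries=16):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, d))
        self.attn_text = nn.MultiheadAttention(d, 8, batch_first=True)
        self.attn_prop = nn.MultiheadAttention(d, 8, batch_first=True)
        self.box_head = nn.Linear(d, 4)  # regress boxes, no pool selection

    def forward(self, text_feats, prop_feats):
        q = self.queries.unsqueeze(0).expand(text_feats.size(0), -1, -1)
        q, _ = self.attn_text(q, text_feats, text_feats)  # attend to utterance
        q, _ = self.attn_prop(q, prop_feats, prop_feats)  # attend to proposals
        return self.box_head(q)                           # (B, n_queries, 4)

dec = GroundingDecoder()
boxes = dec(torch.randn(2, 12, 256), torch.randn(2, 50, 256))
print(boxes.shape)  # torch.Size([2, 16, 4])
```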
no code implementations • 29 Sep 2021 • Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki
Object detectors are typically trained on a fixed vocabulary of objects and attributes that is often too restrictive for open-domain language grounding, where a language utterance may refer to visual entities at various levels of abstraction, such as a cat, the leg of a cat, or the stain on the front leg of the chair.
1 code implementation • ICCV 2021 • Markos Diomataris, Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Maragos
Scene Graph Generators (SGGs) are models that, given an image, build a directed graph where each edge represents a predicted <subject, predicate, object> triplet.
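For concreteness, here is a minimal sketch of the output structure an SGG produces; the field names are illustrative, and labels are deduplicated here for brevity (real SGGs track object instances).

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    objects: list = field(default_factory=list)  # node labels
    edges: list = field(default_factory=list)    # (subj_idx, predicate, obj_idx)

    def add_triplet(self, subj, predicate, obj):
        for label in (subj, obj):
            if label not in self.objects:
                self.objects.append(label)
        self.edges.append((self.objects.index(subj), predicate,
                           self.objects.index(obj)))

g = SceneGraph()
g.add_triplet("person", "riding", "bike")
g.add_triplet("person", "wearing", "helmet")
print(g.objects, g.edges)
# ['person', 'bike', 'helmet'] [(0, 'riding', 1), (0, 'wearing', 2)]
```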
1 code implementation • 9 Jun 2020 • Georgia Chalvatzaki, Nikolaos Gkanatsios, Petros Maragos, Jan Peters
Inherent morphological characteristics of objects may offer a wide range of plausible grasping orientations, which obfuscates the visual learning of robotic grasping.
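A hypothetical illustration of why this ambiguity matters, assuming the common trick of discretizing grasp angle into bins (bin count and angles are my own toy values): regressing a single angle averages the valid modes, while a binned target preserves them.

```python
import numpy as np

n_bins = 18                             # 10-degree bins over [0, 180)
bin_centers = np.arange(n_bins) * 10.0

# Two equally plausible grasp orientations for, e.g., a square object
plausible = [30.0, 120.0]
target = np.zeros(n_bins)
for ang in plausible:
    target[int(ang // 10)] = 1.0
target /= target.sum()                  # multi-modal target distribution

# A single regressed angle would average the modes to 75 degrees,
# grasping neither orientation; the binned target keeps both.
print(bin_centers[target > 0])          # [ 30. 120.]
```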
1 code implementation • 15 Feb 2019 • Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Athanasia Zlatintsi, Petros Maragos
Detecting visual relationships, i.e., <Subject, Predicate, Object> triplets, is a challenging Scene Understanding task, approached in the past via linguistic priors or spatial information in a single feature branch.
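As a generic sketch of the alternative, a predicate classifier can fuse visual, spatial, and linguistic cues in separate branches before classification; the dimensions, branch names, and single-layer heads below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PredicateClassifier(nn.Module):
    def __init__(self, n_predicates=70):
        super().__init__()
        self.visual = nn.Linear(2048, 256)   # union-box appearance features
        self.spatial = nn.Linear(8, 256)     # subject/object box geometry
        self.language = nn.Linear(600, 256)  # subject+object word embeddings
        self.cls = nn.Linear(3 * 256, n_predicates)

    def forward(self, vis, spa, lang):
        h = torch.cat([self.visual(vis), self.spatial(spa),
                       self.language(lang)], dim=-1)  # fuse the three branches
        return self.cls(torch.relu(h))

model = PredicateClassifier()
logits = model(torch.randn(4, 2048), torch.randn(4, 8), torch.randn(4, 600))
print(logits.shape)  # torch.Size([4, 70])
```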