48 papers with code • 0 benchmarks • 4 datasets
In the first stage, we train a CNN to map each pixel to an embedding space where pixels from the same plane instance have similar embeddings.
Ranked #1 on Plane Instance Segmentation on NYU Depth v2
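The embedding idea in the plane-instance snippet above (pixels from the same plane instance should have similar embeddings) can be illustrated with a small pull/push objective. This is a minimal sketch under stated assumptions, not the paper's actual loss: the function name `pull_push_loss`, the hinge form, and the margins `margin_pull` and `margin_push` are hypothetical choices made for illustration.

```python
import numpy as np

def pull_push_loss(embeddings, labels, margin_pull=0.5, margin_push=1.5):
    """Illustrative loss (assumed form, not taken from the paper).

    embeddings: (N, D) array, one embedding per pixel.
    labels:     (N,) integer array, plane instance id per pixel.
    """
    ids = np.unique(labels)
    # Mean embedding of each plane instance.
    centers = np.stack([embeddings[labels == i].mean(axis=0) for i in ids])

    # Pull term: penalize pixels farther than margin_pull from their own center.
    pull = 0.0
    for k, i in enumerate(ids):
        dist = np.linalg.norm(embeddings[labels == i] - centers[k], axis=1)
        pull += np.mean(np.maximum(dist - margin_pull, 0.0))
    pull /= len(ids)

    # Push term: penalize pairs of centers closer than margin_push.
    push = 0.0
    if len(ids) > 1:
        diff = centers[:, None, :] - centers[None, :, :]
        center_dist = np.linalg.norm(diff, axis=2)
        off_diag = ~np.eye(len(ids), dtype=bool)
        push = np.mean(np.maximum(margin_push - center_dist[off_diag], 0.0))

    return float(pull + push)

# Toy data: four "pixels" in a 2-D embedding space forming two tight groups.
emb = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
loss_consistent = pull_push_loss(emb, np.array([0, 0, 1, 1]))  # labels match groups
loss_mixed = pull_push_loss(emb, np.array([0, 1, 0, 1]))       # labels cut across groups
```

When the labels agree with the groups in embedding space, both terms vanish; when they cut across the groups, the loss is positive, which is the behavior a training signal of this kind needs.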
We propose to (i) rethink pairwise interactions with a self-attention mechanism, and (ii) jointly model Human-Robot as well as Human-Human interactions in the deep reinforcement learning framework.
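To make the self-attention idea in the snippet above concrete, here is a generic scaled dot-product self-attention over per-agent feature vectors. It is a sketch only: the feature layout (row 0 for the robot, remaining rows for humans), the dimensions, and the random projection matrices are assumptions for illustration and do not reproduce the paper's architecture.

```python
import numpy as np

def self_attention(agent_feats, w_q, w_k, w_v):
    """Scaled dot-product self-attention over agents.

    agent_feats: (N, D) array, one feature row per agent.
    w_q, w_k, w_v: (D, H) projection matrices (random here, learned in practice).
    Returns the attended features (N, H) and the attention weights (N, N).
    """
    q = agent_feats @ w_q
    k = agent_feats @ w_k
    v = agent_feats @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    # Subtract the row-wise max for a numerically stable softmax.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 6))  # hypothetical: row 0 robot, rows 1-3 humans
w_q, w_k, w_v = (rng.normal(size=(6, 8)) for _ in range(3))
attended, weights = self_attention(feats, w_q, w_k, w_v)
```

Because every agent attends to every other agent, entry `weights[i, j]` can be read as how strongly agent `i` weighs agent `j`, which covers both Human-Robot and Human-Human pairs in a single matrix.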
We further incorporate our proposed RT-BENE baselines into the recently presented RT-GENE gaze estimation framework, where they provide real-time inference of the openness of the eyes.
Ranked #1 on Blink estimation on RT-BENE
Semantic SLAM is an important field in autonomous driving and intelligent agents. It can enable robots to perform high-level navigation tasks, acquire simple cognition or reasoning abilities, and support language-based human-robot interaction.
This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space (i.e., a 2D space used for texture mapping of 3D meshes).
In this paper, we study the problem of recovering 3D planar surfaces from a single image of a man-made environment.
The formulation is designed to identify and to disregard dynamic objects in order to obtain a medium-term invariant map representation.