no code implementations • 5 Apr 2024 • Xingyu Liu, Chenyangguang Zhang, Gu Wang, Ruida Zhang, Xiangyang Ji
In robotic vision, the de facto paradigm is to learn in simulated environments and then transfer to real-world applications, which raises the essential challenge of bridging the sim-to-real domain gap.
no code implementations • 14 Mar 2024 • Tomas Hodan, Martin Sundermeyer, Yann Labbe, Van Nguyen Nguyen, Gu Wang, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Jiri Matas
In the new tasks, methods were required to learn new objects during a short onboarding stage (max 5 minutes, 1 GPU) from provided 3D object models.
no code implementations • 23 Nov 2023 • Bowen Fu, Gu Wang, Chenyangguang Zhang, Yan Di, Ziqin Huang, Zhiying Leng, Fabian Manhardt, Xiangyang Ji, Federico Tombari
Reconstructing hand-held objects from a single RGB image is a challenging task in computer vision.
no code implementations • CVPR 2024 • Chenyangguang Zhang, Guanlong Jiao, Yan Di, Gu Wang, Ziqin Huang, Ruida Zhang, Fabian Manhardt, Bowen Fu, Federico Tombari, Xiangyang Ji
Previous works concerning single-view hand-held object reconstruction typically rely on supervision from 3D ground-truth models, which are hard to collect in real world.
no code implementations • 25 Feb 2023 • Martin Sundermeyer, Tomas Hodan, Yann Labbe, Gu Wang, Eric Brachmann, Bertram Drost, Carsten Rother, Jiri Matas
In 2022, we witnessed another significant improvement in the pose estimation accuracy -- the state of the art, which was 56.9 AR$_C$ in 2019 (Vidal et al.) and 69.8 AR$_C$ in 2020 (CosyPose), moved to new heights of 83.7 AR$_C$ (GDRNPP).
1 code implementation • 17 Jul 2022 • Xingyu Liu, Gu Wang, Yi Li, Xiangyang Ji
While category-level 9DoF object pose estimation has emerged recently, previous correspondence-based and direct regression methods are both limited in accuracy by the large intra-category variance in object shape, color, and other attributes.
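For background on the correspondence-based family mentioned above: a 9DoF pose comprises 3D rotation, 3D translation, and 3D scale, and a classical closed-form baseline for recovering pose from point correspondences is Umeyama alignment, which yields rotation, translation, and a uniform scale (7 of the 9 DoF; category-level methods typically predict per-axis scales separately). A minimal numpy sketch, not this paper's method, with illustrative function names:

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Recover R, t, and uniform scale s such that dst ~= s * R @ src + t,
    from corresponding 3D points of shape (N, 3). Classical closed-form
    similarity alignment (Umeyama, 1991); illustrative baseline only."""
    mu_src, mu_dst = src.mean(0), dst.mean(0)
    X, Y = src - mu_src, dst - mu_dst          # center both point sets
    cov = Y.T @ X / len(src)                    # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                            # avoid reflections
    R = U @ S @ Vt
    var_src = (X ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src      # optimal uniform scale
    t = mu_dst - s * R @ mu_src
    return R, t, s

# synthetic check: apply a known similarity transform, then recover it
rng = np.random.default_rng(0)
P = rng.standard_normal((100, 3))
a = 0.3
Rz = np.array([[np.cos(a), -np.sin(a), 0],
               [np.sin(a),  np.cos(a), 0],
               [0, 0, 1.0]])
Q = 1.5 * P @ Rz.T + np.array([0.1, -0.2, 0.5])
R, t, s = umeyama_alignment(P, Q)
print(np.allclose(R, Rz), np.isclose(s, 1.5))
```

With exact correspondences the closed-form solution recovers the transform to numerical precision; real category-level pipelines must instead cope with noisy, deformed correspondences, which is the limitation the abstract points at.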
1 code implementation • 19 Mar 2022 • Gu Wang, Fabian Manhardt, Xingyu Liu, Xiangyang Ji, Federico Tombari
6D object pose estimation is a fundamental yet challenging problem in computer vision.
2 code implementations • ICCV 2021 • Yan Di, Fabian Manhardt, Gu Wang, Xiangyang Ji, Nassir Navab, Federico Tombari
Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (e.g. the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem.
Ranked #1 on 6D Pose Estimation using RGB on Occlusion LineMOD
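One reason direct rotation regression is delicate is that common parameterizations (quaternions, Euler angles) are discontinuous; direct regression networks therefore often predict a continuous 6D representation and orthogonalize it into a rotation matrix via Gram-Schmidt. A minimal sketch of that mapping (an illustrative aside, not necessarily this paper's exact parameterization):

```python
import numpy as np

def rot6d_to_matrix(x):
    """Map a 6D vector (two stacked 3D vectors) to a rotation matrix via
    Gram-Schmidt orthogonalization. The map is continuous, which makes it
    friendlier to regress than quaternions or Euler angles."""
    a1, a2 = x[:3], x[3:]
    b1 = a1 / np.linalg.norm(a1)       # first column: normalize a1
    a2 = a2 - (b1 @ a2) * b1           # remove the b1 component from a2
    b2 = a2 / np.linalg.norm(a2)       # second column: orthonormal to b1
    b3 = np.cross(b1, b2)              # third column: right-handed frame
    return np.stack([b1, b2, b3], axis=1)

R = rot6d_to_matrix(np.array([1.0, 2.0, 0.0, 0.0, 1.0, 1.0]))
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
```

Any (non-degenerate) 6D output thus lands exactly on SO(3), so the network never has to learn the manifold constraint itself.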
no code implementations • CVPR 2020 • Jianzhun Shao, Yuhang Jiang, Gu Wang, Zhigang Li, Xiangyang Ji
6D pose estimation from a single RGB image is a challenging and vital task in computer vision.
1 code implementation • CVPR 2021 • Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji
In this work, we perform an in-depth investigation on both direct and indirect methods, and propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations.
Ranked #3 on 6D Pose Estimation using RGB on Occlusion LineMOD
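For intuition about the "indirect" route that GDR-Net's learned module replaces: given dense 2D-3D correspondences, a classical solver such as the Direct Linear Transform can recover the 6D pose. A minimal numpy sketch under simplifying assumptions (noiseless correspondences, known intrinsics; `dlt_pnp` and all names are illustrative, and this is not the paper's learned Patch-PnP):

```python
import numpy as np

def dlt_pnp(pts3d, pts2d, K):
    """Recover [R|t] from n >= 6 2D-3D correspondences via the Direct
    Linear Transform, projecting the linear estimate onto SO(3)."""
    uvh = np.linalg.inv(K) @ np.vstack([pts2d.T, np.ones(len(pts2d))])
    uv = (uvh[:2] / uvh[2]).T                  # normalized image coords
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, uv):   # two equations per point
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)                   # null vector -> pose, up to scale
    P /= np.cbrt(np.linalg.det(P[:, :3]))      # fix scale and sign
    U, _, Vt2 = np.linalg.svd(P[:, :3])        # nearest rotation matrix
    return U @ Vt2, P[:, 3]

# synthetic check: project points with a known pose, then recover it
rng = np.random.default_rng(1)
K = np.array([[500.0, 0, 320], [0, 500, 240], [0, 0, 1]])
c, s = np.cos(0.4), np.sin(0.4)
R_gt = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
t_gt = np.array([0.1, -0.1, 4.0])
pts3d = rng.uniform(-1, 1, (20, 3))
cam = pts3d @ R_gt.T + t_gt                    # points in camera frame
proj = cam @ K.T
pts2d = proj[:, :2] / proj[:, 2:]
R_est, t_est = dlt_pnp(pts3d, pts2d, K)
print(np.allclose(R_est, R_gt, atol=1e-6), np.allclose(t_est, t_gt, atol=1e-6))
```

Such solvers are non-differentiable (or only awkwardly so), which is the motivation for learning the correspondence-to-pose step end-to-end as the abstract describes.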
1 code implementation • ECCV 2020 • Gu Wang, Fabian Manhardt, Jianzhun Shao, Xiangyang Ji, Nassir Navab, Federico Tombari
6D object pose estimation is a fundamental problem in computer vision.
no code implementations • 12 Mar 2020 • Fabian Manhardt, Gu Wang, Benjamin Busam, Manuel Nickel, Sven Meier, Luca Minciullo, Xiangyang Ji, Nassir Navab
Contemporary monocular 6D pose estimation methods can only cope with a handful of object instances.
2 code implementations • ECCV 2018 • Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, Dieter Fox
Estimating the 6D pose of objects from images is an important problem in various applications such as robot manipulation and virtual reality.
Ranked #1 on 6D Pose Estimation using RGB on YCB-Video
no code implementations • 9 Jul 2016 • Jialin Wu, Gu Wang, Wukui Yang, Xiangyang Ji
We propose a novel deep supervised neural network for the task of action recognition in videos, which implicitly takes advantage of visual tracking and shares the robustness of both deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN).
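The CNN-plus-RNN pipeline described above can be sketched in miniature: a backbone turns each frame into a feature vector, and a recurrent network aggregates the sequence into a class prediction. The sketch below uses untrained random weights purely to show the data flow (all names are illustrative, and a real system would use a trained convolutional backbone, not a random projection):

```python
import numpy as np

rng = np.random.default_rng(0)

def frame_features(frames):
    """Stand-in for a CNN backbone: one feature vector per frame.
    (A random projection here; a real system would use a conv net.)"""
    W = rng.standard_normal((frames.shape[1], 64)) * 0.1
    return np.tanh(frames @ W)

def rnn_classify(feats, n_classes, hidden=32):
    """Minimal vanilla RNN over the frame sequence, then a linear
    classifier on the final hidden state."""
    d = feats.shape[1]
    Wx = rng.standard_normal((d, hidden)) * 0.1
    Wh = rng.standard_normal((hidden, hidden)) * 0.1
    Wo = rng.standard_normal((hidden, n_classes)) * 0.1
    h = np.zeros(hidden)
    for x in feats:                 # h_t = tanh(Wx x_t + Wh h_{t-1})
        h = np.tanh(x @ Wx + h @ Wh)
    return np.argmax(h @ Wo)        # class scores from final state

video = rng.standard_normal((16, 128))   # 16 frames, 128-dim each
pred = rnn_classify(frame_features(video), n_classes=10)
print(0 <= pred < 10)
```

The recurrence is what lets the model exploit temporal structure across frames, which a per-frame classifier alone cannot.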