no code implementations • 5 Oct 2023 • Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Hanwen Jiang, Dejia Xu, Zehao Zhu, Dilin Wang, Zhangyang Wang
To address this challenge, we introduce PF-GRT, a new Pose-Free framework for the Generalizable Rendering Transformer, which eliminates the need for pre-computed camera poses and instead leverages feature matching learned directly from data.
1 code implementation • 2 Oct 2023 • Hanwen Jiang, Zhenyu Jiang, Yue Zhao, QiXing Huang
Are camera poses necessary for multi-view 3D modeling?
no code implementations • 26 Sep 2023 • Zhenyu Jiang, Hanwen Jiang, Yuke Zhu
By incorporating semantic priors with self-supervised flow training, Doduo produces accurate dense correspondences that are robust to dynamic scene changes.
1 code implementation • 25 May 2023 • Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Dejia Xu, Hanwen Jiang, Zhangyang Wang
To mitigate this issue, we propose a general paradigm for object pose estimation, called Promptable Object Pose Estimation (POPE).
no code implementations • 8 Dec 2022 • Hanwen Jiang, Zhenyu Jiang, Kristen Grauman, Yuke Zhu
The reconstruction results under predicted poses are comparable to those obtained with ground-truth poses.
1 code implementation • 12 Aug 2021 • Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang
While significant progress has been made on understanding hand-object interactions in computer vision, it is still very challenging for robots to perform complex dexterous manipulation.
no code implementations • CVPR 2021 • Shaowei Liu, Hanwen Jiang, Jiarui Xu, Sifei Liu, Xiaolong Wang
Estimating 3D hand and object pose from a single image is an extremely challenging problem: hands and objects are often self-occluded during interaction, and 3D annotations are scarce, as even humans cannot perfectly label the ground truth from a single image.
Ranked #7 on hand-object pose estimation on the HO-3D benchmark
no code implementations • ICCV 2021 • Hanwen Jiang, Shaowei Liu, Jiashun Wang, Xiaolong Wang
Based on hand-object contact consistency, we design novel objectives for training the human grasp generation model, as well as a new self-supervised task that allows the grasp generation network to be adapted even at test time.
2 code implementations • 6 Mar 2019 • Qin Zou, Hanwen Jiang, Qiyu Dai, Yuanhao Yue, Long Chen, Qian Wang
Specifically, each frame is abstracted into features by a CNN block, and the CNN features of multiple consecutive frames, which form a time series, are then fed into an RNN block for feature learning and lane prediction.
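The CNN-then-RNN pipeline described above can be sketched as follows. A toy "CNN" (one convolution plus pooled statistics) abstracts each frame into a feature vector, and a vanilla RNN consumes the per-frame features in temporal order; the final hidden state would feed a lane-prediction head. All layer choices, shapes, and names here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def cnn_features(frame, kernel):
    """Toy CNN block: valid 2D convolution + pooled statistics
    as a per-frame feature vector."""
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return np.array([out.mean(), out.max(), out.std()])  # 3-d feature

def rnn(features, Wx, Wh, b):
    """Vanilla RNN over the time-ordered per-frame features;
    returns the final hidden state."""
    h = np.zeros(Wh.shape[0])
    for x in features:
        h = np.tanh(Wx @ x + Wh @ h + b)
    return h

rng = np.random.default_rng(0)
frames = [rng.standard_normal((8, 8)) for _ in range(5)]  # 5 consecutive frames
kernel = rng.standard_normal((3, 3))
feats = [cnn_features(f, kernel) for f in frames]

hidden = rnn(feats,
             Wx=rng.standard_normal((4, 3)),
             Wh=rng.standard_normal((4, 4)),
             b=np.zeros(4))
lane_score = 1.0 / (1.0 + np.exp(-hidden.sum()))  # stand-in prediction head
print(hidden.shape, lane_score)
```

The design point is the division of labor: the CNN handles per-frame spatial structure, while the RNN aggregates evidence across frames, so lanes briefly occluded in one frame can still be predicted from temporal context.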