no code implementations • 24 Mar 2025 • Yihan Chen, Wenfei Yang, Huan Ren, Shifeng Zhang, Tianzhu Zhang, Feng Wu
Despite the success of existing 3D correspondence-based methods, the reliance on explicit feature matching suffers from small overlaps in visible regions and unreliable feature estimation for invisible regions.
1 code implementation • International Conference on Learning Representations 2025 • Chuxin Wang, Wenfei Yang, Xiang Liu, Tianzhu Zhang
To the best of our knowledge, this is the first method to model queries as system states and scene points as system inputs, which can simultaneously update scene point features and query features with linear complexity.
Ranked #1 on 3D Object Detection on ScanNetV2
no code implementations • 18 Mar 2025 • Huan Ren, Wenfei Yang, Xiang Liu, Shifeng Zhang, Tianzhu Zhang
Category-level object pose estimation aims to determine the pose and size of novel objects in specific categories.
1 code implementation • IEEE Transactions on Image Processing 2024 • Chuxin Wang, Yixin Zha, Jianfeng He, Wenfei Yang, Tianzhu Zhang
Recently, masked point modeling-based methods have shown significant performance improvements for point cloud understanding. However, these methods rely on overlapping grouping strategies (the k-nearest neighbor algorithm), which leak structural information of masked groups early, and they overlook the semantic modeling of object components, so parts with the same semantics end up with obviously different features because of their different positions.
Ranked #7 on Few-Shot 3D Point Cloud Classification on ModelNet40 5-way (20-shot) (using extra training data)
3D Part Segmentation
Few-Shot 3D Point Cloud Classification
1 code implementation • 17 Oct 2024 • Jiahao Lu, Jiacheng Deng, Ruijie Zhu, Yanzhe Liang, Wenfei Yang, Tianzhu Zhang, Xu Zhou
Dynamic scene rendering is an intriguing yet challenging problem.
1 code implementation • 10 Oct 2024 • Ruijie Zhu, Yanzhe Liang, Hanzhi Chang, Jiacheng Deng, Jiahao Lu, Wenfei Yang, Tianzhu Zhang, Yongdong Zhang
Specifically, we first introduce an optical flow decoupling module that decouples optical flow into camera flow and motion flow, corresponding to camera movement and object motion respectively.
2 code implementations • 4 Sep 2024 • Li Liu, Ruijie Zhu, Jiacheng Deng, Ziyang Song, Wenfei Yang, Tianzhu Zhang
Specifically, in the proposed plane guided depth generator (PGDG), we design a set of plane queries as prototypes to softly model planes in the scene and predict per-pixel plane coefficients.
no code implementations • 25 Jun 2024 • Zhuoyuan Li, Yubo Ai, Jiahao Lu, Chuxin Wang, Jiacheng Deng, Hanzhi Chang, Yanzhe Liang, Wenfei Yang, Shifeng Zhang, Tianzhu Zhang
Transformers have demonstrated impressive results for 3D point cloud semantic segmentation.
Ranked #6 on 3D Semantic Segmentation on ScanNet200
1 code implementation • CVPR 2024 • Xiao Lin, Wenfei Yang, Yuan Gao, Tianzhu Zhang
(2) The second design is a Geometric-Aware Feature Aggregation module, which can efficiently integrate the local and global geometric information into keypoint features.
no code implementations • 7 Mar 2024 • Yu Zhu, Chuxiong Sun, Wenfei Yang, Wenqiang Wei, Bo Tang, Tianzhu Zhang, Zhiyu Li, Shifeng Zhang, Feiyu Xiong, Jie Hu, MingChuan Yang
Reinforcement Learning from Human Feedback (RLHF) is the prevailing approach to ensure Large Language Models (LLMs) align with human values.
1 code implementation • 20 Jan 2024 • Yinchao Ma, Yuyang Tang, Wenfei Yang, Tianzhu Zhang, Jinpeng Zhang, Mengxue Kang
Single object tracking aims to locate the target object in a video sequence according to the state specified by different modal references, including the initial bounding box (BBOX), natural language (NL), or both (NL+BBOX).
Ranked #2 on Visual Object Tracking on AVisT
no code implementations • 7 Jan 2024 • Xianghui Xie, Xi Wang, Nikos Athanasiou, Bharat Lal Bhatnagar, Chun-Hao P. Huang, Kaichun Mo, Hao Chen, Xia Jia, Zerui Zhang, Liangxian Cui, Xiao Lin, Bingqiao Qian, Jie Xiao, Wenfei Yang, Hyeongjin Nam, Daniel Sungho Jung, Kihoon Kim, Kyoung Mu Lee, Otmar Hilliges, Gerard Pons-Moll
Modeling the interaction between humans and objects has been an emerging research direction in recent years.
1 code implementation • ICCV 2023 • Chuxin Wang, Wenfei Yang, Tianzhu Zhang
Semi-supervised 3D object detection from point clouds aims to train a detector with a small amount of labeled data and a large amount of unlabeled data.
1 code implementation • CVPR 2023 • Huan Ren, Wenfei Yang, Tianzhu Zhang, Yongdong Zhang
Weakly-supervised temporal action localization aims to localize and recognize actions in untrimmed videos with only video-level category labels during training.
Ranked #3 on Weakly Supervised Action Localization on THUMOS’14
Multiple Instance Learning
Weakly Supervised Action Localization
no code implementations • CVPR 2021 • Wenfei Yang, Tianzhu Zhang, Xiaoyuan Yu, Tian Qi, Yongdong Zhang, Feng Wu
To alleviate this problem, we propose a novel Uncertainty Guided Collaborative Training (UGCT) strategy, which mainly includes two key designs: (1) The first design is an online pseudo label generation module, in which the RGB and FLOW streams work collaboratively to learn from each other.
no code implementations • CVPR 2021 • Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang
In this paper, we present an Action Unit Memory Network (AUMN) for weakly supervised temporal action localization, which can mitigate the above two challenges by learning an action unit memory bank.
Ranked #7 on Weakly Supervised Action Localization on THUMOS14