1 code implementation • 30 Nov 2023 • Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen
Due to the resource-intensive nature of training vision-language models on expansive video data, a majority of studies have centered on adapting pre-trained image-language models to the video domain.
Ranked #2 on Zero-Shot Action Recognition on Kinetics
1 code implementation • 23 Jun 2023 • Tom Tongjia Chen, Hongshan Yu, Zhengeng Yang, Ming Li, Zechuan Li, Jingwen Wang, Wei Miao, Wei Sun, Chen Chen
Affordance-Centric Question-driven Task Completion (AQTC) has been proposed to acquire knowledge from videos to furnish users with comprehensive and systematic instructions.
no code implementations • 8 Mar 2023 • Yong He, Hongshan Yu, Zhengeng Yang, Wei Sun, Mingtao Feng, Ajmal Mian
Local features and contextual dependencies are crucial for 3D point cloud analysis.
no code implementations • 8 Mar 2023 • Yong He, Hongshan Yu, Zhengeng Yang, Xiaoyan Liu, Wei Sun, Ajmal Mian
In particular, we achieve state-of-the-art semantic segmentation results of 76% mIoU on S3DIS 6-fold and 72.2% on S3DIS Area5.
1 code implementation • CVPR 2023 • Zechuan Li, Hongshan Yu, Zhengeng Yang, Tongjia Chen, Naveed Akhtar
In this work, we propose AShapeFormer, a semantics-guided object-level shape encoding module for 3D object detection.
no code implementations • 12 Aug 2022 • Zhengeng Yang, Hongshan Yu, Wei Sun, Li-Cheng, Ajmal Mian
In this paper, we present an easy-to-train framework that learns domain-invariant prototypes for domain adaptive semantic segmentation.
no code implementations • 9 Mar 2021 • Yong He, Hongshan Yu, Xiaoyan Liu, Zhengeng Yang, Wei Sun, Ajmal Mian
This paper fills the gap and provides a comprehensive survey of the recent progress made in deep learning based 3D segmentation.
no code implementations • 18 Dec 2020 • Zhengeng Yang, Hongshan Yu, Yong He, Zhi-Hong Mao, Ajmal Mian
By learning to solve a Jigsaw Puzzle problem with 25 patches and transferring the learned features to the semantic segmentation task on the Cityscapes dataset, we achieve a 5.8 percentage point improvement over the baseline model initialized from random values.
no code implementations • 22 Aug 2020 • Qiang Fu, Hongshan Yu, Xiaolong Wang, Zhengeng Yang, Hong Zhang, Ajmal Mian
ORB-SLAM2 \cite{orbslam2} is a benchmark method in this domain; however, it consumes significant time computing descriptors that never get reused unless a frame is selected as a keyframe.
Robotics Computational Geometry I.4.0; I.4.9
no code implementations • 16 Mar 2019 • Zhengeng Yang, Hongshan Yu, Qiang Fu, Wei Sun, Wenyan Jia, Mingui Sun, Zhi-Hong Mao
The rapid development of autonomous driving in recent years presents many challenges for scene understanding.