5 code implementations • 11 Apr 2024 • Yiwen Tang, Jiaming Liu, Dong Wang, Zhigang Wang, Shanghang Zhang, Bin Zhao, Xuelong Li
The adapter incorporates prior spatial knowledge from the source modality to guide the local feature aggregation of 3D tokens, driving the semantic adaptation of any-modality transformers.
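The sketch below illustrates one way such a spatially guided adapter could look: 3D token features are aggregated over their k nearest neighbors in coordinate space and fused back through a small residual bottleneck. This is a minimal illustration of the idea, not the authors' implementation; `SpatialAdapter`, `k`, and the mean-pooling choice are assumptions.

```python
import torch
import torch.nn as nn


class SpatialAdapter(nn.Module):
    """Bottleneck adapter that pools k-nearest-neighbor token features (illustrative)."""

    def __init__(self, dim: int, bottleneck: int = 64, k: int = 8):
        super().__init__()
        self.k = k
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, tokens: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # tokens: [B, N, C] transformer token features
        # coords: [B, N, 3] 3D positions acting as the spatial prior
        B, N, C = tokens.shape
        dist = torch.cdist(coords, coords)                    # [B, N, N] pairwise distances
        idx = dist.topk(self.k, largest=False).indices        # [B, N, k] nearest neighbors
        neighbors = torch.gather(
            tokens.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C),
        )                                                     # [B, N, k, C] neighbor features
        local = neighbors.mean(dim=2)                         # aggregate the local group
        return tokens + self.up(self.act(self.down(local)))   # residual bottleneck fusion
```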
5 code implementations • 4 Oct 2023 • Yiwen Tang, Ray Zhang, Zoey Guo, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li
We introduce Point-PEFT, a novel framework for adapting pre-trained point cloud models with minimal learnable parameters.
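As a rough sketch of the parameter-efficient recipe the abstract describes (freeze the pre-trained backbone and train only small inserted modules), assuming hypothetical `backbone` and `adapters` modules rather than the actual Point-PEFT components:

```python
import torch.nn as nn


def make_peft(backbone: nn.Module, adapters: nn.Module) -> nn.Module:
    # Freeze every pre-trained weight; only the adapter modules remain trainable.
    for p in backbone.parameters():
        p.requires_grad = False
    model = nn.Sequential(backbone, adapters)
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable parameters: {trainable}/{total} ({100 * trainable / total:.2f}%)")
    return model
```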
5 code implementations • 1 Sep 2023 • Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Yiwen Tang, Xianzheng Ma, Jiaming Han, Kexin Chen, Peng Gao, Xianzhi Li, Hongsheng Li, Pheng-Ann Heng
We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D images, language, audio, and video.
Ranked #5 on 3D Question Answering (3D-QA) on 3D MM-Vet
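A minimal sketch of how such multi-modal alignment is typically trained: embeddings from a trainable 3D encoder are pulled toward paired embeddings from a frozen joint space with a symmetric InfoNCE loss. This illustrates the general technique, not the actual Point-Bind objective; all names and the temperature value are assumptions.

```python
import torch
import torch.nn.functional as F


def alignment_loss(point_emb: torch.Tensor, anchor_emb: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    # point_emb:  [B, D] embeddings from the trainable 3D encoder
    # anchor_emb: [B, D] paired embeddings from a frozen encoder
    #             (e.g. image/text/audio) in the shared space
    p = F.normalize(point_emb, dim=-1)
    a = F.normalize(anchor_emb, dim=-1)
    logits = p @ a.t() / temperature                 # [B, B] scaled cosine similarities
    targets = torch.arange(p.size(0), device=p.device)
    # symmetric InfoNCE: each point matches its own anchor and vice versa
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```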
7 code implementations • ICCV 2023 • 29 Mar 2023 • Zoey Guo, Yiwen Tang, Ray Zhang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li
In this paper, we propose ViewRefer, a multi-view framework for 3D visual grounding that explores how to grasp view knowledge from both the text and 3D modalities.
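To make the multi-view idea concrete, here is a hedged sketch in which per-view object features are conditioned on the encoded text query and pooled across views before scoring proposals. It is an illustration under assumed tensor shapes, not the ViewRefer architecture, and `MultiViewGrounder` is a hypothetical name.

```python
import torch
import torch.nn as nn


class MultiViewGrounder(nn.Module):
    """Text-conditioned scoring of object proposals seen from multiple views (illustrative)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, obj_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats:  [B, V, N, C] object features under V views
        # text_feats: [B, T, C]    encoded referring expression
        B, V, N, C = obj_feats.shape
        objs = obj_feats.reshape(B, V * N, C)
        fused, _ = self.attn(objs, text_feats, text_feats)  # attend objects to the text
        fused = fused.reshape(B, V, N, C).mean(dim=1)       # pool evidence across views
        return self.score(fused).squeeze(-1)                # [B, N] grounding scores
```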