Search Results for author: Zhangyang Qi

Found 3 papers, 2 papers with code

Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases

1 code implementation • 22 Dec 2023 • Zhangyang Qi, Ye Fang, Mengchen Zhang, Zeyi Sun, Tong Wu, Ziwei Liu, Dahua Lin, Jiaqi Wang, Hengshuang Zhao

We conducted a series of structured experiments to evaluate their performance in various industrial application scenarios, offering a comprehensive perspective on their practical utility.

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

1 code implementation • 5 Dec 2023 • Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao

Multimodal Large Language Models (MLLMs) have excelled in 2D image-text comprehension and image generation, but their understanding of the 3D world is notably deficient, limiting progress in 3D language understanding and generation.

3D Generation, Reading Comprehension

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

no code implementations • 2 Jun 2023 • Zhangyang Qi, Jiaqi Wang, Xiaoyang Wu, Hengshuang Zhao

Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.

3D Object Detection, Autonomous Driving, +2
