Search Results for author: Ray Zhang

Found 11 papers, 6 papers with code

Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding

7 code implementations11 Apr 2024 Yiwen Tang, Ray Zhang, Jiaming Liu, Zoey Guo, Dong Wang, Zhigang Wang, Bin Zhao, Shanghang Zhang, Peng Gao, Hongsheng Li, Xuelong Li

The adapter incorporates prior spatial knowledge from the source modality to guide the local feature aggregation of 3D tokens, compelling the semantic adaption of any-modality transformers.

Cloud-Device Collaborative Learning for Multimodal Large Language Models

no code implementations CVPR 2024 Guanqun Wang, Jiaming Liu, Chenxuan Li, Junpeng Ma, Yuan Zhang, Xinyu Wei, Kevin Zhang, Maurice Chong, Ray Zhang, Yijiang Liu, Shanghang Zhang

However, the deployment of these large-scale MLLMs on client devices is hindered by their extensive model parameters, leading to a notable decline in generalization capabilities when these models are compressed for device deployment.

Device-Cloud Collaboration Knowledge Distillation +1

FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection

no code implementations22 Dec 2023 Dongmei Zhang, Chang Li, Ray Zhang, Shenghao Xie, Wei Xue, Xiaodong Xie, Shanghang Zhang

In this work, we propose FM-OV3D, a method of Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection, which improves the open-vocabulary localization and recognition abilities of 3D model by blending knowledge from multiple pre-trained foundation models, achieving true open-vocabulary without facing constraints from original 3D datasets.

3D Object Detection 3D Open-Vocabulary Object Detection +2

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

no code implementations21 Dec 2023 Senqiao Yang, Jiaming Liu, Ray Zhang, Mingjie Pan, Zoey Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Yandong Guo, Shanghang Zhang

In this paper, we introduce LiDAR-LLM, which takes raw LiDAR data as input and harnesses the remarkable reasoning capabilities of LLMs to gain a comprehensive understanding of outdoor 3D scenes.

Instruction Following Language Modelling +1

Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models

7 code implementations4 Oct 2023 Yiwen Tang, Ray Zhang, Zoey Guo, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li

To this end, we introduce Point-PEFT, a novel framework for adapting point cloud pre-trained models with minimal learnable parameters.

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance

7 code implementations29 Mar 2023 Zoey Guo, Yiwen Tang, Ray Zhang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li

In this paper, we propose ViewRefer, a multi-view framework for 3D visual grounding exploring how to grasp the view knowledge from both text and 3D modalities.

3D visual grounding

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding

no code implementations ICCV 2023 Zoey Guo, Yiwen Tang, Ray Zhang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li

In this paper, we propose ViewRefer, a multi-view framework for 3D visual grounding exploring how to grasp the view knowledge from both text and 3D modalities.

3D visual grounding

Stain-free Detection of Embryo Polarization using Deep Learning

no code implementations8 Nov 2021 Cheng Shen, Adiyant Lamba, Meng Zhu, Ray Zhang, Changhuei Yang, Magdalena Zernicka Goetz

In conclusion, we describe a method for detecting a key developmental feature of embryo development that avoids clinically impermissible fluorescence staining.

Ensemble Learning Self-Learning

A New Framework for Registration of Semantic Point Clouds from Stereo and RGB-D Cameras

1 code implementation10 Nov 2020 Ray Zhang, Tzu-Yuan Lin, Chien Erh Lin, Steven A. Parkison, William Clark, Jessy W. Grizzle, Ryan M. Eustice, Maani Ghaffari

This paper reports on a novel nonparametric rigid point cloud registration framework that jointly integrates geometric and semantic measurements such as color or semantic labels into the alignment process and does not require explicit data association.

Point Cloud Registration

Bayesian Spatial Kernel Smoothing for ScalableDense Semantic Mapping

2 code implementations10 Sep 2019 Lu Gan, Ray Zhang, Jessy W. Grizzle, Ryan M. Eustice, Maani Ghaffari

This paper develops a Bayesian continuous 3D semantic occupancy map from noisy point cloud measurements.

Robotics

Cannot find the paper you are looking for? You can Submit a new open access paper.