Search Results for author: Xiangyang Zhu

Found 11 papers, 9 papers with code

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

1 code implementation 29 Aug 2024 Ziyu Guo, Renrui Zhang, Xiangyang Zhu, Chengzhuo Tong, Peng Gao, Chunyuan Li, Pheng-Ann Heng

We introduce SAM2Point, a preliminary exploration adapting Segment Anything Model 2 (SAM 2) for zero-shot and promptable 3D segmentation.

Segmentation

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

1 code implementation 5 Jun 2024 Le Zhuo, Ruoyi Du, Han Xiao, Yangguang Li, Dongyang Liu, Rongjie Huang, Wenze Liu, Lirui Zhao, Fu-Yun Wang, Zhanyu Ma, Xu Luo, Zehan Wang, Kaipeng Zhang, Xiangyang Zhu, Si Liu, Xiangyu Yue, Dingning Liu, Wanli Ouyang, Ziwei Liu, Yu Qiao, Hongsheng Li, Peng Gao

Lumina-T2X is a nascent family of Flow-based Large Diffusion Transformers that establishes a unified framework for transforming noise into various modalities, such as images and videos, conditioned on text instructions.

Point Cloud Generation Text-to-Image Generation

Less is More: Towards Efficient Few-shot 3D Semantic Segmentation via Training-free Networks

1 code implementation 24 Aug 2023 Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Jiaming Liu, Hao Dong, Peng Gao

However, the prior pre-training stage not only introduces excessive time overhead, but also incurs a significant domain gap on 'unseen' classes.

3D Semantic Segmentation Few-shot 3D semantic segmentation +1

Efficient Multi-View Inverse Rendering Using a Hybrid Differentiable Rendering Method

no code implementations 19 Aug 2023 Xiangyang Zhu, Yiling Pan, Bailin Deng, Bin Wang

In this paper, we introduce a novel hybrid differentiable rendering method to efficiently reconstruct the 3D geometry and reflectance of a scene from multi-view images captured by conventional hand-held cameras.

3D geometry Inverse Rendering

Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement

1 code implementation ICCV 2023 Xiangyang Zhu, Renrui Zhang, Bowei He, Aojun Zhou, Dong Wang, Bin Zhao, Peng Gao

The popularity of Contrastive Language-Image Pre-training (CLIP) has propelled its application to diverse downstream vision tasks.

All Computational Efficiency +1

PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning

2 code implementations ICCV 2023 Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Ziyao Zeng, Zipeng Qin, Shanghang Zhang, Peng Gao

In this paper, we first collaborate CLIP and GPT to be a unified 3D open-world learner, named as PointCLIP V2, which fully unleashes their potential for zero-shot 3D classification, segmentation, and detection.

3D Classification 3D Object Detection +11

Conditional Generative Models for Simulation of EMG During Naturalistic Movements

1 code implementation 3 Nov 2022 Shihan Ma, Alexander Kenneth Clarke, Kostiantyn Maksymenko, Samuel Deslauriers-Gauthier, Xinjun Sheng, Xiangyang Zhu, Dario Farina

As a solution to this problem, we propose a transfer learning approach, in which a conditional generative model is trained to mimic the output of an advanced numerical model.

Data Augmentation Transfer Learning

Steerable Pyramid Transform Enables Robust Left Ventricle Quantification

2 code implementations 20 Jan 2022 Xiangyang Zhu, Kede Ma, Wufeng Xue

Predicting cardiac indices has long been a focal point in the medical imaging community.

Learning to Navigate from Simulation via Spatial and Semantic Information Synthesis with Noise Model Embedding

no code implementations 13 Oct 2019 Gang Chen, Hongzhe Yu, Wei Dong, Xinjun Sheng, Xiangyang Zhu, Han Ding

While training an end-to-end navigation network in the real world is usually of high cost, simulation provides a safe and cheap environment in this training stage.

Navigate
