Search Results for author: Zaiwei Zhang

Found 17 papers, 9 papers with code

Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality

no code implementations • 21 Dec 2024 • Liyan Chen, Gregory P. Meyer, Zaiwei Zhang, Eric M. Wolff, Paul Vernaza

Recent efforts recognize the power of scale in 3D learning (e.g., PTv3) and attention mechanisms (e.g., FlashAttention).

Uncertainty-Guided Enhancement on Driving Perception System via Foundation Models

no code implementations • 2 Oct 2024 • Yunhao Yang, Yuxin Hu, Mao Ye, Zaiwei Zhang, Zhichao Lu, Yi Xu, Ufuk Topcu, Ben Snyder

Multimodal foundation models offer promising advancements for enhancing driving perception systems, but their high computational and financial costs pose challenges.

Conformal Prediction

VLMine: Long-Tail Data Mining with Vision Language Models

no code implementations • 23 Sep 2024 • Mao Ye, Gregory P. Meyer, Zaiwei Zhang, Dennis Park, Siva Karthik Mustikovela, Yuning Chai, Eric M Wolff

We propose a simple and scalable data mining approach that leverages the knowledge contained within a large vision language model (VLM).

3D Object Detection, Autonomous Driving, +3 more

VLM-KD: Knowledge Distillation from VLM for Long-Tail Visual Recognition

no code implementations • 29 Aug 2024 • Zaiwei Zhang, Gregory P. Meyer, Zhichao Lu, Ashish Shrivastava, Avinash Ravichandran, Eric M. Wolff

To our knowledge, this work is the first to utilize knowledge distillation with text supervision generated by an off-the-shelf VLM and apply it to vanilla randomly initialized vision encoders.

Knowledge Distillation, Language Modeling, +1 more

The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models

no code implementations • 4 Oct 2023 • Chenwei Wu, Li Erran Li, Stefano Ermon, Patrick Haffner, Rong Ge, Zaiwei Zhang

Compositionality is a common property in many modalities, including natural languages and images, but the compositional generalization of multi-modal models is not well understood.

LiDAR-Based 3D Object Detection via Hybrid 2D Semantic Scene Generation

1 code implementation • 4 Apr 2023 • Haitao Yang, Zaiwei Zhang, Xiangru Huang, Min Bai, Chen Song, Bo Sun, Li Erran Li, QiXing Huang

Bird's-Eye View (BEV) features are popular intermediate scene representations shared by the 3D backbone and the detector head in LiDAR-based object detectors.

3D Object Detection, object-detection, +1 more

Implicit Surface Contrastive Clustering for LiDAR Point Clouds

no code implementations • CVPR 2023 • Zaiwei Zhang, Min Bai, Erran Li

The first task focuses on learning semantic information by sorting local groups of points in the scene into a globally consistent set of semantically meaningful clusters using contrastive learning.

3D Object Detection, Clustering, +5 more

FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction

1 code implementation • CVPR 2022 • Zhenpei Yang, Zhile Ren, Miguel Angel Bautista, Zaiwei Zhang, Qi Shan, QiXing Huang

In this paper, we present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses.

3D geometry, Camera Pose Estimation, +2 more

Scene Synthesis via Uncertainty-Driven Attribute Synchronization

1 code implementation • ICCV 2021 • Haitao Yang, Zaiwei Zhang, Siming Yan, Haibin Huang, Chongyang Ma, Yi Zheng, Chandrajit Bajaj, QiXing Huang

This task is challenging because 3D scenes exhibit diverse patterns, ranging from continuous ones, such as object sizes and the relative poses between pairs of shapes, to discrete patterns, such as occurrence and co-occurrence of objects with symmetrical relationships.


Self-Supervised Pretraining of 3D Features on any Point-Cloud

1 code implementation • ICCV 2021 • Zaiwei Zhang, Rohit Girdhar, Armand Joulin, Ishan Misra

Pretraining on large labeled datasets is a prerequisite to achieve good performance in many computer vision tasks like 2D object recognition, video classification, etc.

Object, object-detection, +4 more

H3DNet: 3D Object Detection Using Hybrid Geometric Primitives

2 code implementations • ECCV 2020 • Zaiwei Zhang, Bo Sun, Haitao Yang, Qi-Xing Huang

We show how to convert the predicted geometric primitives into object proposals by defining a distance function between an object and the geometric primitives.

3D Object Detection, Object, +1 more

Joint Learning of Neural Networks via Iterative Reweighted Least Squares

1 code implementation • 16 May 2019 • Zaiwei Zhang, Xiangru Huang, Qi-Xing Huang, Xiao Zhang, Yuan Li

We formulate this problem as joint learning of multiple copies of the same network architecture and enforce the network weights to be shared across these networks.

General Classification, Image Classification, +1 more

Path-Invariant Map Networks

1 code implementation • CVPR 2019 • Zaiwei Zhang, Zhenxiao Liang, Lemeng Wu, Xiaowei Zhou, Qi-Xing Huang

Optimizing a network of maps among a collection of objects/domains (or map synchronization) is a central problem across computer vision and many other relevant fields.

3D Semantic Segmentation, Scene Segmentation, +1 more

Deep Generative Modeling for Scene Synthesis via Hybrid Representations

no code implementations • 6 Aug 2018 • Zaiwei Zhang, Zhenpei Yang, Chongyang Ma, Linjie Luo, Alexander Huth, Etienne Vouga, Qi-Xing Huang

We show a principled way to train this model by combining discriminator losses for both a 3D object arrangement representation and a 2D image-based representation.
