Search Results for author: Qihang Zhang

Found 15 papers, 6 papers with code

3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

no code implementations 28 May 2024 Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang

This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity.


Towards Text-guided 3D Scene Composition

no code implementations CVPR 2024 Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee

We marry the locality of objects with globality of scenes by introducing a hybrid 3D representation - explicit for objects and implicit for scenes.

Text to 3D

BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation

no code implementations CVPR 2024 Qihang Zhang, Yinghao Xu, Yujun Shen, Bo Dai, Bolei Zhou, Ceyuan Yang

Generating large-scale 3D scenes cannot simply apply existing 3D object synthesis techniques, since 3D scenes usually have complex spatial configurations and consist of a number of objects at varying scales.

Scene Generation

Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance

no code implementations 4 Sep 2023 Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song

Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem.

Offline RL reinforcement-learning

GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding

1 code implementation ICCV 2023 Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu, Hongsheng Li

In this paper, we propose Geometry Enhanced Masked Image Modeling (GeoMIM) to transfer the knowledge of the LiDAR model in a pretrain-finetune paradigm for improving the multi-view camera-based 3D detection.

3D Object Detection Decoder

Towards Smooth Video Composition

1 code implementation 14 Dec 2022 Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Video generation requires synthesizing consistent and persistent frames with dynamic content over time.

Image Generation single-image-generation

Noise-resilient approach for deep tomographic imaging

no code implementations 22 Nov 2022 Zhen Guo, Zhiguang Liu, Qihang Zhang, George Barbastathis, Michael E. Glinsky

We propose a noise-resilient deep reconstruction algorithm for X-ray tomography.

Generative Category-Level Shape and Pose Estimation with Semantic Primitives

1 code implementation 3 Oct 2022 Guanglin Li, Yifeng Li, Zhichao Ye, Qihang Zhang, Tao Kong, Zhaopeng Cui, Guofeng Zhang

Then, by using a SIM(3)-invariant shape descriptor, we gracefully decouple the shape and pose of an object, thus supporting latent shape optimization of target objects in arbitrary poses.

6D Pose Estimation using RGBD Object

F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks

no code implementations 12 May 2022 Xintian Wu, Qihang Zhang, Yiming Wu, Huanyu Wang, Songyuan Li, Lingyun Sun, Xi Li

Formulated as a conditional generation problem, face animation aims at synthesizing continuous face images from a single source image driven by a set of conditional face motions.

Extracting particle size distribution from laser speckle with a physics-enhanced autocorrelation-based estimator (PEACE)

no code implementations 20 Apr 2022 Qihang Zhang, Janaka C. Gamekkanda, Ajinkya Pandit, Wenlong Tang, Charles Papageorgiou, Chris Mitchell, Yihui Yang, Michael Schwaerzler, Tolutola Oyetunde, Richard D. Braatz, Allan S. Myerson, George Barbastathis

Extracting quantitative information about highly scattering surfaces from an imaging system is challenging because the phase of the scattered light undergoes multiple folds upon propagation, resulting in complex speckle patterns.

BIG-bench Machine Learning

Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining

1 code implementation 5 Apr 2022 Qihang Zhang, Zhenghao Peng, Bolei Zhou

Specifically, we train an inverse dynamic model with a small amount of labeled data and use it to predict action labels for all the YouTube video frames.

Autonomous Driving Imitation Learning

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

2 code implementations 26 Sep 2021 Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, Bolei Zhou

Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic.

Benchmarking Decision Making

Improving the Generalization of End-to-End Driving through Procedural Generation

2 code implementations 26 Dec 2020 Quanyi Li, Zhenghao Peng, Qihang Zhang, Chunxiao Liu, Bolei Zhou

We validate that training with an increasing number of procedurally generated scenes significantly improves the generalization of the agent across scenarios of different traffic densities and road networks.

Autonomous Driving
