Search Results for author: Guoqing Wang

Found 18 papers, 4 papers with code

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

no code implementations 23 Apr 2024 Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot, voxel-wise 3D segmentation problem.

Structure-Aware Human Body Reshaping with Adaptive Affinity-Graph Network

no code implementations 22 Apr 2024 Qiwen Deng, Yangcen Liu, Wen Li, Guoqing Wang

In particular, an SRM filter is utilized to extract high-frequency details, which are combined with spatial features as input to the BSD.
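The SRM-residual idea in the snippet above can be sketched as follows. This is only an illustration, not the paper's implementation: it assumes the common 5x5 "KV" high-pass kernel from the SRM filter family as the high-frequency extractor, and simply stacks the residual with the raw image as a two-channel input (the BSD branch itself is not reproduced).

```python
import numpy as np

# One classic SRM high-pass kernel (the 5x5 "KV" filter); the paper
# may use a different kernel set. Entries sum to zero, so flat regions
# produce a zero residual and only high-frequency detail survives.
KV_KERNEL = np.array([
    [-1,  2,  -2,  2, -1],
    [ 2, -6,   8, -6,  2],
    [-2,  8, -12,  8, -2],
    [ 2, -6,   8, -6,  2],
    [-1,  2,  -2,  2, -1],
], dtype=np.float64) / 12.0

def srm_highpass(img: np.ndarray) -> np.ndarray:
    """Convolve a single-channel image with the SRM kernel (reflect padding,
    'same' output size). The kernel is symmetric, so correlation == convolution."""
    pad = KV_KERNEL.shape[0] // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + 5, j:j + 5] * KV_KERNEL)
    return out

# Combine the high-frequency residual with the raw spatial input, mirroring
# the "combined with spatial features" step described in the abstract.
img = np.random.rand(32, 32)
residual = srm_highpass(img)
combined = np.stack([img, residual], axis=0)  # 2-channel input
```

A real pipeline would learn further features from both channels; here the stacking merely shows how the residual and spatial streams are concatenated.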

SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction

no code implementations 15 Apr 2024 Pin Tang, Zhongdao Wang, Guoqing Wang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

Vision-based perception for autonomous driving requires explicit modeling of a 3D space, into which 2D latent representations are mapped and where subsequent 3D operators are applied.

Autonomous Driving

Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting

2 code implementations 10 Apr 2024 Hao Lu, Jiaqi Tang, Xinli Xu, Xu Cao, Yunpeng Zhang, Guoqing Wang, Dalong Du, Hao Chen, Yingcong Chen

Finally, for MC3D-Det joint training, an elaborate dataset-merging strategy is designed to resolve inconsistencies in camera counts and camera parameters across datasets.

3D Object Detection, Autonomous Driving +1

Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning

no code implementations 15 Mar 2024 Meixuan Li, Tianyu Li, Guoqing Wang, Peng Wang, Yang Yang, Heng Tao Shen

Aligning these distributions between corresponding regions from different tasks imparts higher flexibility and capacity to capture intra-region structures, accommodating a broader range of tasks.

Depth Estimation, Semantic Segmentation +1

Open-Vocabulary Calibration for Vision-Language Models

no code implementations 7 Feb 2024 Shuoyuan Wang, Jindong Wang, Guoqing Wang, Bob Zhang, Kaiyang Zhou, Hongxin Wei

Vision-language models (VLMs) have emerged as formidable tools, showing their strong capability in handling various open-vocabulary tasks in image recognition, text-driven visual content generation, and visual chatbots, to name a few.

Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

no code implementations 24 Oct 2023 Yinjie Lei, Zixuan Wang, Feng Chen, Guoqing Wang, Peng Wang, Yang Yang

Multi-modal 3D scene understanding has gained considerable attention due to its wide applications in many areas, such as autonomous driving and human-computer interaction.

Autonomous Driving, Scene Understanding

Faster Video Moment Retrieval with Point-Level Supervision

no code implementations 23 May 2023 Xun Jiang, Zailei Zhou, Xing Xu, Yang Yang, Guoqing Wang, Heng Tao Shen

Existing VMR methods suffer from two defects: (1) massive, expensive temporal annotations are required to obtain satisfactory performance; (2) complicated cross-modal interaction modules are deployed, leading to high computational cost and low efficiency in the retrieval process.

Moment Retrieval, Natural Language Queries +1

Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement

1 code implementation CVPR 2023 Yuhui Wu, Chen Pan, Guoqing Wang, Yang Yang, Jiwei Wei, Chongyi Li, Heng Tao Shen

To address this issue, we propose a novel semantic-aware knowledge-guided framework (SKF) that can assist a low-light enhancement model in learning rich and diverse priors encapsulated in a semantic segmentation model.

Low-Light Image Enhancement, Semantic Segmentation
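The knowledge-guidance idea the SKF entry describes — a frozen semantic segmentation model supplying priors to the enhancement network — can be illustrated with a generic semantic-consistency loss: penalize divergence between the per-pixel class distributions the segmentation model predicts on the enhanced output and on the normal-light reference. This is a hedged sketch of the general idea, not SKF's actual loss or architecture; the logit maps below stand in for a hypothetical frozen segmentation network's outputs.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_consistency_loss(logits_enhanced, logits_normal):
    """Mean per-pixel KL(P_normal || P_enhanced) between the class
    distributions predicted for the normal-light reference and for
    the enhanced output. Zero when the predictions agree exactly."""
    p = softmax(logits_normal)
    q = softmax(logits_enhanced)
    eps = 1e-8  # guard against log(0)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl.mean())

# Toy example: (H, W, num_classes) logit maps from a hypothetical
# frozen segmentation network.
rng = np.random.default_rng(0)
ref_logits = rng.normal(size=(16, 16, 5))
identical = semantic_consistency_loss(ref_logits, ref_logits)
perturbed = semantic_consistency_loss(
    ref_logits + rng.normal(size=(16, 16, 5)), ref_logits)
```

In a training loop this term would be added to the usual reconstruction loss, nudging the enhancer toward outputs the segmentation model interprets the same way as the well-lit reference.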

ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding

1 code implementation 23 Mar 2023 Ziyang Lu, Yunqiang Pei, Guoqing Wang, Yang Yang, Zheng Wang, Heng Tao Shen

Despite their effectiveness, existing methods suffer from low recognition accuracy when multiple adjacent objects have similar appearances. To address this issue, this work introduces human-robot interaction as a cue to facilitate the development of 3D visual grounding.

Visual Grounding

Thunder: Thumbnail based Fast Lightweight Image Denoising Network

no code implementations 24 May 2022 Yifeng Zhou, Xing Xu, Shuaicheng Liu, Guoqing Wang, Huimin Lu, Heng Tao Shen

To achieve promising results on removing noise from real-world images, most existing denoising networks are built with complex structures, making them impractical for deployment.

Image Denoising, SSIM

Learning content and context with language bias for Visual Question Answering

1 code implementation 21 Dec 2020 Chao Yang, Su Feng, Dongsheng Li, HuaWei Shen, Guoqing Wang, Bin Jiang

Many works concentrate on reducing language bias, which makes models answer questions while ignoring visual content and language context.

Question Answering, Visual Question Answering

ERL-Net: Entangled Representation Learning for Single Image De-Raining

no code implementations ICCV 2019 Guoqing Wang, Changming Sun, Arcot Sowmya

In this paper, we hypothesize that there exists an inherent mapping from the low-quality embedding to a latent optimal one, with which the generator (decoder) can produce much better results.

Image Restoration, Image-to-Image Translation +2
