Search Results for author: Feng Zhao

Found 66 papers, 22 papers with code

Disentangle Your Dense Object Detector

2 code implementations • 7 Jul 2021 • Zehui Chen, Chenhongyi Yang, Qiaofei Li, Feng Zhao, Zheng-Jun Zha, Feng Wu

Extensive experiments on MS COCO benchmark show that our approach can lead to 2.0 mAP, 2.4 mAP and 2.2 mAP absolute improvements on RetinaNet, FCOS, and ATSS baselines with negligible extra overhead.

Disentanglement Object +2

27,765

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

1 code implementation • 21 Nov 2023 • Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin

In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet often constrained by the scarcity of high-quality image-text data.

Ranked #1 on visual instruction following on LLaVA-Bench

Descriptive visual instruction following +2

1,622

P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds

2 code implementations • CVPR 2020 • Haozhe Qi, Chen Feng, Zhiguo Cao, Feng Zhao, Yang Xiao

Specifically, we first sample seeds from the point clouds in the template and search area, respectively.

3D Object Tracking Object Tracking

232

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

1 code implementation • 19 Mar 2024 • Zehui Chen, Kuikun Liu, Qiuchen Wang, Wenwei Zhang, Jiangning Liu, Dahua Lin, Kai Chen, Feng Zhao

Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks; however, they are still far inferior to API-based models when acting as agents.

Hallucination

194

T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step

1 code implementation • 21 Dec 2023 • Zehui Chen, Weihua Du, Wenwei Zhang, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Zhuo, Songyang Zhang, Dahua Lin, Kai Chen, Feng Zhao

Based on that, we further introduce T-Eval to evaluate the tool utilization capability step by step.

Instruction Following Retrieval

150

AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection

1 code implementation • 21 Jul 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

Recently, AutoAlign presents a learnable paradigm in combining these two modalities for 3D object detection.

3D Object Detection Autonomous Driving +1

138

BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection

1 code implementation • 17 Nov 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

Instead of directly training a depth prediction network, we unify the image and LiDAR features in the Bird-Eye-View (BEV) space and adaptively transfer knowledge across non-homogenous representations in a teacher-student paradigm.

Ranked #14 on 3D Object Detection on nuScenes Camera Only

3D Object Detection Depth Estimation +4

Are We on the Right Way for Evaluating Large Vision-Language Models?

1 code implementation • 29 Mar 2024 • Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao

We evaluate 16 leading LVLMs on MMStar to assess their multi-modal capabilities, and on 7 benchmarks with the proposed metrics to investigate their data leakage and actual multi-modal gain.

World Knowledge

MMNet: Muscle motion-guided network for micro-expression recognition

1 code implementation • 14 Jan 2022 • Hanting Li, Mingzhe Sui, Zhaoqing Zhu, Feng Zhao

By adding the position embeddings of the face generated by the PC module at the end of the two branches, the PC module can help to add position information to facial muscle motion pattern features for MER.

Micro Expression Recognition Micro-Expression Recognition +1

Towards Fine-grained Large Object Segmentation 1st Place Solution to 3D AI Challenge 2020 -- Instance Segmentation Track

1 code implementation • 10 Sep 2020 • Zehui Chen, Qiaofei Li, Feng Zhao

This technical report introduces our solutions of Team 'FineGrainedSeg' for Instance Segmentation track in 3D AI Challenge 2020.

Instance Segmentation Semantic Segmentation

Bijective Mapping Network for Shadow Removal

2 code implementations • CVPR 2022 • Yurui Zhu, Jie Huang, Xueyang Fu, Feng Zhao, Qibin Sun, Zheng-Jun Zha

Shadow removal, which aims to restore the background in the shadow regions, is challenging due to its highly ill-posed nature.

Shadow Removal

Ultra-High Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark

1 code implementation • CVPR 2023 • Deyi Ji, Feng Zhao, Hongtao Lu, Mingyuan Tao, Jieping Ye

With the increasing interest and rapid development of methods for Ultra-High Resolution (UHR) segmentation, a large-scale benchmark covering a wide range of scenes with full fine-grained dense annotations is urgently needed to facilitate the field.

Ranked #1 on Semantic Segmentation on INRIA Aerial Image Labeling (mIOU metric)

Land Cover Classification Semantic Segmentation

Domain-Unified Prompt Representations for Source-Free Domain Generalization

1 code implementation • 29 Sep 2022 • Hongjing Niu, Hanting Li, Feng Zhao, Bin Li

The proposed scheme generates diverse prompts from a domain bank that contains many more diverse domains than existing DG datasets.

Source-free Domain Generalization

Empowering Low-Light Image Enhancer through Customized Learnable Priors

1 code implementation • ICCV 2023 • Naishan Zheng, Man Zhou, Yanmeng Dong, Xiangyu Rui, Jie Huang, Chongyi Li, Feng Zhao

In this work, we propose a paradigm for low-light image enhancement that explores the potential of customized learnable priors to improve the transparency of the deep unfolding paradigm.

Low-Light Image Enhancement

Learning from Noisy Data for Semi-Supervised 3D Object Detection

1 code implementation • ICCV 2023 • Zehui Chen, Zhenyu Li, Shuo Wang, Dengpan Fu, Feng Zhao

To this end, we propose NoiseDet, a simple yet effective framework for semi-supervised 3D object detection.

3D Object Detection object-detection +2

Intensity-Aware Loss for Dynamic Facial Expression Recognition in the Wild

1 code implementation • 19 Aug 2022 • Hanting Li, Hongjing Niu, Zhaoqing Zhu, Feng Zhao

One of the main reasons is that video sequences often contain frames with different expression intensities, especially for the facial expressions in the real-world scenarios, while the images in SFER frequently present uniform and high expression intensities.

Ranked #8 on Dynamic Facial Expression Recognition on FERV39k

Dynamic Facial Expression Recognition Facial Expression Recognition +1

PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety

1 code implementation • 22 Jan 2024 • Zaibin Zhang, Yongting Zhang, Lijun Li, Hongzhi Gao, Lijun Wang, Huchuan Lu, Feng Zhao, Yu Qiao, Jing Shao

In this paper, we explore these concerns through the innovative lens of agent psychology, revealing that the dark psychological states of agents constitute a significant threat to safety.

Ingredient-Oriented Multi-Degradation Learning for Image Restoration

1 code implementation • CVPR 2023 • Jinghao Zhang, Jie Huang, Mingde Yao, Zizheng Yang, Hu Yu, Man Zhou, Feng Zhao

Learning to leverage the relationship among diverse image restoration tasks is quite beneficial for unraveling the intrinsic ingredients behind the degradation.

Image Restoration

Unleashing the Potential of Unsupervised Pre-Training with Intra-Identity Regularization for Person Re-Identification

1 code implementation • 1 Dec 2021 • Zizheng Yang, Xin Jin, Kecheng Zheng, Feng Zhao

During the pre-training, we attempt to address two critical issues for learning fine-grained ReID features: (1) the augmentations in CL pipeline may distort the discriminative clues in person images.

Contrastive Learning Person Re-Identification +2

High-quality Image Dehazing with Diffusion Model

1 code implementation • 23 Aug 2023 • Hu Yu, Jie Huang, Kaiwen Zheng, Feng Zhao

The latter stage exploits the strong generation ability of DDPM to compensate for the haze-induced huge information loss, by working in conjunction with the physical modelling.

Denoising Image Dehazing

Deep Fourier Up-Sampling

1 code implementation • 11 Oct 2022 • Man Zhou, Hu Yu, Jie Huang, Feng Zhao, Jinwei Gu, Chen Change Loy, Deyu Meng, Chongyi Li

Existing convolutional neural networks widely adopt spatial down-/up-sampling for multi-scale modeling.

Image Dehazing Image Segmentation +4

Selective Noise Suppression Methods Using Random SVPWM to Shape the Noise Spectrum of PMSMs

1 code implementation • 16 Feb 2023 • Jian Wen, Xiaobin Cheng, Peifeng Ji, Jun Yang, Feng Zhao

Both the pulse position and switching frequency are randomized in the second method.

Position

Structural Learning for Template-free Protein Folding

no code implementations • 6 Nov 2013 • Feng Zhao

This thesis aims to solve the template-free protein folding problem by tackling two important components: efficient sampling in vast conformation space, and design of knowledge-based potentials with high accuracy.

Protein Folding

MVT: Mask Vision Transformer for Facial Expression Recognition in the wild

no code implementations • 8 Jun 2021 • Hanting Li, Mingzhe Sui, Feng Zhao, ZhengJun Zha, Feng Wu

Facial Expression Recognition (FER) in the wild is an extremely challenging task in computer vision due to variant backgrounds, low-quality facial images, and the subjectiveness of annotators.

Facial Expression Recognition Facial Expression Recognition (FER)

MFEViT: A Robust Lightweight Transformer-based Network for Multimodal 2D+3D Facial Expression Recognition

no code implementations • 20 Sep 2021 • Hanting Li, Mingzhe Sui, Zhaoqing Zhu, Feng Zhao

To the best of our knowledge, this is the first work to introduce vision transformer into multimodal 2D+3D FER.

3D Facial Expression Recognition Facial Expression Recognition

Performance-Guaranteed ODE Solvers with Complexity-Informed Neural Networks

no code implementations • NeurIPS Workshop DLDE 2021 • Feng Zhao, Xiang Chen, Jun Wang, Zuoqiang Shi, Shao-Lun Huang

Traditionally, we provide technical parameters for ODE solvers, such as the order, the step size and the local error threshold.

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

no code implementations • 17 Jan 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinghong Jiang, Feng Zhao, Bolei Zhou, Hang Zhao

This map enables our model to automate the alignment of non-homogenous features in a dynamic and data-driven manner.

3D Object Detection Autonomous Driving +1

Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth

no code implementations • 3 Feb 2022 • Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, Feng Wu, Feng Zhao

However, in some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame prediction may cause the depth inconsistency.

3D Scene Reconstruction Depth Completion +1

Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection

no code implementations • 25 Apr 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding.

3D Object Detection Graph structure learning +3

AFNet-M: Adaptive Fusion Network with Masks for 2D+3D Facial Expression Recognition

no code implementations • 24 May 2022 • Mingzhe Sui, Hanting Li, Zhaoqing Zhu, Feng Zhao

2D+3D facial expression recognition (FER) can effectively cope with illumination changes and pose variations by simultaneously merging 2D texture and more robust 3D depth information.

3D Facial Expression Recognition Facial Expression Recognition

Unleashing Potential of Unsupervised Pre-Training With Intra-Identity Regularization for Person Re-Identification

no code implementations • CVPR 2022 • Zizheng Yang, Xin Jin, Kecheng Zheng, Feng Zhao

During the pre-training, we attempt to address two critical issues for learning fine-grained ReID features: (1) the augmentations in CL pipeline may distort the discriminative clues in person images.

Contrastive Learning Person Re-Identification +2

Mutual Information-Driven Pan-Sharpening

no code implementations • CVPR 2022 • Man Zhou, Keyu Yan, Jie Huang, Zihe Yang, Xueyang Fu, Feng Zhao

Despite the remarkable progress, existing state-of-the-art Pan-sharpening methods don't explicitly enforce the complementary information learning between two modalities of PAN and MS images.

Exposure Normalization and Compensation for Multiple-Exposure Correction

no code implementations • CVPR 2022 • Jie Huang, Yajing Liu, Xueyang Fu, Man Zhou, Yang Wang, Feng Zhao, Zhiwei Xiong

However, the procedures of correcting underexposure and overexposure to normal exposures are much different from each other, leading to large discrepancies for the network in correcting multiple exposures, thus resulting in poor performance.

Image Enhancement

Underdetermined 2D-DOD and 2D-DOA Estimation for Bistatic Coprime EMVS-MIMO Radar: From the Difference Coarray Perspective

no code implementations • 6 Jun 2022 • Qianpeng Xie, Yihang Du, He Wang, Xiaoyi Pan, Feng Zhao

Firstly, a 5-D tensor model was constructed by using the multi-dimensional space-time characteristics of the received data.

8D Parameters Estimation for Bistatic EMVS-MIMO Radar via the nested PARAFAC

no code implementations • 4 Jun 2022 • Qianpeng Xie, He Wang, Yihang Du, Xiaoyi Pan, Feng Zhao

Firstly, the outer part PARAFAC algorithm was carried out to estimate the receive spatial response matrix and its first way factor matrix.

NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition

no code implementations • 10 Jun 2022 • Hanting Li, Mingzhe Sui, Zhaoqing Zhu, Feng Zhao

Dynamic facial expression recognition (DFER) in the wild is an extremely challenging task, due to a large number of noisy frames in the video sequences.

Ranked #9 on Dynamic Facial Expression Recognition on FERV39k

Dynamic Facial Expression Recognition Facial Expression Recognition +1

Source-Free Domain Adaptation for Real-world Image Dehazing

no code implementations • 14 Jul 2022 • Hu Yu, Jie Huang, Yajing Liu, Qi Zhu, Man Zhou, Feng Zhao

Although certain Domain Adaptation (DA) dehazing methods have been presented, they inevitably require access to the source dataset to reduce the gap between the source synthetic and target real domains.

Image Dehazing Source-Free Domain Adaptation +1

Enhancement by Your Aesthetic: An Intelligible Unsupervised Personalized Enhancer for Low-Light Images

no code implementations • 15 Jul 2022 • Naishan Zheng, Jie Huang, Qi Zhu, Man Zhou, Feng Zhao, Zheng-Jun Zha

Low-light image enhancement is an inherently subjective process whose targets vary with the user's aesthetic.

Low-Light Image Enhancement

CNSNet: A Cleanness-Navigated-Shadow Network for Shadow Removal

no code implementations • 6 Sep 2022 • Qianhao Yu, Naishan Zheng, Jie Huang, Feng Zhao

The key to shadow removal is recovering the contents of the shadow regions with the guidance of the non-shadow regions.

Long-range modeling Shadow Removal

KSG: Knowledge and Skill Graph

no code implementations • 13 Sep 2022 • Feng Zhao, Ziqi Zhang, Donglin Wang

This is the first study that we are aware of that looks into dynamic KSG for skill retrieval and learning.

Attribute Knowledge Graphs +2

OpticE: A Coherence Theory-Based Model for Link Prediction

no code implementations • COLING 2022 • Xiangyu Gui, Feng Zhao, Langjunqing Jin, Hai Jin

During the learning process, the semantics of each entity are embedded by a vector or a point in a feature space.

Knowledge Graphs Link Prediction +3

A similarity measurement for time series and its application to the stock market

no code implementations • Expert Systems with Applications 2021 • Feng Zhao, Yating Gao, Xinning Li, Zhiyong An, Shiyu Ge, Caiming Zhang

In this paper, to accurately describe the similarity between a pair of time series, a novel similarity measurement is proposed, named the dynamic multi-perspective personalized similarity measurement (DMPSM).

Dynamic Time Warping Time Series +1

Panchromatic and Multispectral Image Fusion via Alternating Reverse Filtering Network

no code implementations • 15 Oct 2022 • Keyu Yan, Man Zhou, Jie Huang, Feng Zhao, Chengjun Xie, Chongyi Li, Danfeng Hong

Panchromatic (PAN) and multi-spectral (MS) image fusion, named Pan-sharpening, refers to super-resolving the low-resolution (LR) MS images in the spatial domain to generate the expected high-resolution (HR) MS images, conditioned on the corresponding high-resolution PAN images.

DETRDistill: A Universal Knowledge Distillation Framework for DETR-families

no code implementations • ICCV 2023 • Jiahao Chang, Shuo Wang, HaiMing Xu, Zehui Chen, Chenhongyi Yang, Feng Zhao

Next, we propose a target-aware feature distillation to help the student model learn from the object-centric features of the teacher model.

Knowledge Distillation object-detection +1

CLIPER: A Unified Vision-Language Framework for In-the-Wild Facial Expression Recognition

no code implementations • 1 Mar 2023 • Hanting Li, Hongjing Niu, Zhaoqing Zhu, Feng Zhao

Facial expression recognition (FER) is an essential task for understanding human behaviors.

Dynamic Facial Expression Recognition Facial Expression Recognition +1

Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View

no code implementations • CVPR 2023 • Shuo Wang, Xinhai Zhao, Hai-Ming Xu, Zehui Chen, Dameng Yu, Jiahao Chang, Zhen Yang, Feng Zhao

Based on the covariate shift assumption, we find that the gap is mainly attributable to the feature distribution of BEV, which is determined by the quality of both depth estimation and the 2D image's feature representation.

3D Object Detection Depth Estimation +3

Novel Quality Measure and Efficient Resolution of Convex Hull Pricing for Unit Commitment

no code implementations • 17 Apr 2023 • Mikhail A. Bragin, Farhan Hyder, Bing Yan, Peter B. Luh, Jinye Zhao, Feng Zhao, Dane A. Schiro, Tongxin Zheng

Several CH pricing methods have been presented, and a feasible cost has been used as a quality measure for the CH price.

Visual Recognition-Driven Image Restoration for Multiple Degradation With Intrinsic Semantics Recovery

no code implementations • CVPR 2023 • Zizheng Yang, Jie Huang, Jiahao Chang, Man Zhou, Hu Yu, Jinghao Zhang, Feng Zhao

Deep image recognition models suffer a significant performance drop when applied to low-quality images since they are trained on high-quality images.

Domain Adaptation Image Enhancement +2

Learning Sample Relationship for Exposure Correction

no code implementations • CVPR 2023 • Jie Huang, Feng Zhao, Man Zhou, Jie Xiao, Naishan Zheng, Kaiwen Zheng, Zhiwei Xiong

The exposure correction task aims to correct both underexposed images and, conversely, overexposed images to normal exposure in a single network.

Task 2

Cooperative IoT Data Sharing with Heterogeneity of Participants Based on Electricity Retail

no code implementations • 31 May 2023 • Bohong Wang, Qinglai Guo, Tian Xia, Qiang Li, Di Liu, Feng Zhao

With the development of Internet of Things (IoT) and big data technology, the data value is increasingly explored in multiple practical scenarios, including electricity transactions.

Data Valuation Fairness

Guided Patch-Grouping Wavelet Transformer with Spatial Congruence for Ultra-High Resolution Segmentation

no code implementations • 3 Jul 2023 • Deyi Ji, Feng Zhao, Hongtao Lu

For the sake of high inference speed and low computation complexity, $\mathcal{T}$ partitions the original UHR image into patches and groups them dynamically, then learns the low-level local details with the lightweight multi-head Wavelet Transformer (WFormer) network.

Coverage Enhancement Strategy in WMSNs Based on a Novel Swarm Intelligence Algorithm: Army Ant Search Optimizer

no code implementations • 3 Jul 2023 • Yindi Yao, Qin Wen, Yanpeng Cui, Feng Zhao, Bozhan Zhao, Yaoping Zeng

As one of the most crucial scenarios of the Internet of Things (IoT), wireless multimedia sensor networks (WMSNs) pay more attention to the information-intensive data (e.g., audio, video, image) for remote environments.

Decomposition Ascribed Synergistic Learning for Unified Image Restoration

no code implementations • 1 Aug 2023 • Jinghao Zhang, Feng Zhao

Learning to restore multiple image degradations within a single model is quite beneficial for real-world applications.

Deblurring Image Deblurring +5

FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models

no code implementations • ICCV 2023 • Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, Feng Zhao

3D scene reconstruction is a long-standing vision task.

3D Scene Reconstruction Monocular Depth Estimation

Exploring Temporal Frequency Spectrum in Deep Video Deblurring

no code implementations • ICCV 2023 • Qi Zhu, Man Zhou, Naishan Zheng, Chongyi Li, Jie Huang, Feng Zhao

Video deblurring aims to restore the latent video frames from their blurred counterparts.

Deblurring

Debias the Training of Diffusion Models

no code implementations • 12 Oct 2023 • Hu Yu, Li Shen, Jie Huang, Man Zhou, Hongsheng Li, Feng Zhao

Diffusion models have demonstrated compelling generation quality by optimizing the variational lower bound through a simple denoising score matching loss.

Denoising

RSG: Fast Learning Adaptive Skills for Quadruped Robots by Skill Graph

no code implementations • 10 Nov 2023 • Hongyin Zhang, Diyuan Shi, Zifeng Zhuang, Han Zhao, Zhenyu Wei, Feng Zhao, Sibo Gai, Shangke Lyu, Donglin Wang

Developing robotic intelligent systems that can adapt quickly to unseen wild situations is one of the critical challenges in pursuing autonomous robotics.

Implicit Relations

ChangeNet: Multi-Temporal Asymmetric Change Detection Dataset

no code implementations • 29 Dec 2023 • Deyi Ji, Siqi Gao, Mingyuan Tao, Hongtao Lu, Feng Zhao

The ChangeNet dataset is suitable for both binary change detection (BCD) and semantic change detection (SCD) tasks.

Change Detection

Coverage Control Algorithm for DSNs Based on Improved Gravitational Search

no code implementations • IEEE Sensors Journal 2022 • Yindi Yao, Huanmin Liao, Xiong Li, Feng Zhao, Xuan Yang, Shanshan Hu

In directional sensor networks (DSNs), coverage control is an important way to ensure efficient communication and reliable data transmission.

Position

Stream Query Denoising for Vectorized HD Map Construction

no code implementations • 17 Jan 2024 • Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao

This paper introduces the Stream Query Denoising (SQD) strategy as a novel approach for temporal modeling in high-definition map (HD-map) construction.

Autonomous Driving Denoising

Prompt Learning on Temporal Interaction Graphs

no code implementations • 9 Feb 2024 • Xi Chen, Siwei Zhang, Yun Xiong, Xixi Wu, Jiawei Zhang, Xiangguo Sun, Yao Zhang, Feng Zhao, Yulin Kang

In detail, we propose a temporal prompt generator to offer temporally-aware prompts for different tasks.

Representation Learning

A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track

no code implementations • 27 Feb 2024 • Zehui Chen, Qiuchen Wang, Zhenyu Li, Jiaming Liu, Shanghang Zhang, Feng Zhao

In this report, we present our solution to the multi-task robustness track of the 1st Visual Continual Learning (VCL) Challenge at ICCV 2023 Workshop.

3D Object Detection Continual Learning +5

View-Centric Multi-Object Tracking with Homographic Matching in Moving UAV

no code implementations • 16 Mar 2024 • Deyi Ji, Siqi Gao, Lanyun Zhu, Yiru Zhao, Peng Xu, Hongtao Lu, Feng Zhao

In this paper, we address the challenge of multi-object tracking (MOT) in moving Unmanned Aerial Vehicle (UAV) scenarios, where irregular flight trajectories, such as hovering, turning left/right, and moving up/down, lead to significantly greater complexity compared to fixed-camera MOT.

Homography Estimation Multi-Object Tracking +1

Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection

no code implementations • 22 Mar 2024 • Hongzhi Gao, Zheng Chen, Zehui Chen, Lin Chen, Jiaming Liu, Shanghang Zhang, Feng Zhao

Training high-accuracy 3D detectors necessitates massive labeled 3D annotations with 7 degrees of freedom, which is laborious and time-consuming.

3D Object Detection object-detection +2

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

no code implementations • 28 Mar 2024 • BoWen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo

To address the problem, we introduce GaussianCube, a structured GS representation that is both powerful and efficient for generative modeling.

Uncovering the Text Embedding in Text-to-Image Diffusion Models

no code implementations • 1 Apr 2024 • Hu Yu, Hao Luo, Fan Wang, Feng Zhao

The correspondence between input text and the generated image exhibits opacity, wherein minor textual modifications can induce substantial deviations in the generated image.
