Search Results for author: Zehui Chen

Found 30 papers, 16 papers with code

Are We on the Right Way for Evaluating Large Vision-Language Models?

1 code implementation • 29 Mar 2024 • Lin Chen, Jinsong Li, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao

We evaluate 16 leading LVLMs on MMStar to assess their multi-modal capabilities, and on 7 benchmarks with the proposed metrics to investigate their data leakage and actual multi-modal gain.

World Knowledge

InternLM2 Technical Report

1 code implementation • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)

4k Long-Context Understanding

5,218

PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition

1 code implementation • 26 Mar 2024 • Chenhongyi Yang, Zehui Chen, Miguel Espinosa, Linus Ericsson, Zhenyu Wang, Jiaming Liu, Elliot J. Crowley

In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information.

Image Classification Instance Segmentation +3

Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection

no code implementations • 22 Mar 2024 • Hongzhi Gao, Zheng Chen, Zehui Chen, Lin Chen, Jiaming Liu, Shanghang Zhang, Feng Zhao

Training high-accuracy 3D detectors necessitates massive labeled 3D annotations with 7 degree-of-freedom, which is laborious and time-consuming.

3D Object Detection object-detection +2

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

1 code implementation • 19 Mar 2024 • Zehui Chen, Kuikun Liu, Qiuchen Wang, Wenwei Zhang, Jiangning Liu, Dahua Lin, Kai Chen, Feng Zhao

Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks, however, they are still far inferior to API-based models when acting as agents.

Hallucination

199

A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track

no code implementations • 27 Feb 2024 • Zehui Chen, Qiuchen Wang, Zhenyu Li, Jiaming Liu, Shanghang Zhang, Feng Zhao

In this report, we present our solution to the multi-task robustness track of the 1st Visual Continual Learning (VCL) Challenge at ICCV 2023 Workshop.

3D Object Detection Continual Learning +5

Stream Query Denoising for Vectorized HD Map Construction

no code implementations • 17 Jan 2024 • Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao

This paper introduces the Stream Query Denoising (SQD) strategy as a novel approach for temporal modeling in high-definition map (HD-map) construction.

Autonomous Driving Denoising

T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step

1 code implementation • 21 Dec 2023 • Zehui Chen, Weihua Du, Wenwei Zhang, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Zhuo, Songyang Zhang, Dahua Lin, Kai Chen, Feng Zhao

Based on that, we further introduce T-Eval to evaluate the tool utilization capability step by step.

Instruction Following Retrieval

158

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

no code implementations • 21 Dec 2023 • Senqiao Yang, Jiaming Liu, Ray Zhang, Mingjie Pan, Zoey Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Yandong Guo, Shanghang Zhang

In this paper, we introduce LiDAR-LLM, which takes raw LiDAR data as input and harnesses the remarkable reasoning capabilities of LLMs to gain a comprehensive understanding of outdoor 3D scenes.

Instruction Following Language Modelling +1

Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation

no code implementations • 19 Dec 2023 • Jiaming Liu, Ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang

To tackle these issues, we propose a continual self-supervised method, Adaptive Distribution Masked Autoencoders (ADMA), which enhances the extraction of target domain knowledge while mitigating the accumulation of distribution shifts.

Decoder Self-Supervised Learning +1

Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation

no code implementations • 24 Sep 2023 • Jiayi Ni, Senqiao Yang, Ran Xu, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, Zehui Chen, Yi Liu, Shanghang Zhang

In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications.

Autonomous Driving Semantic Segmentation +1

Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction

1 code implementation • 17 Mar 2023 • Senqiao Yang, Jiarui Wu, Jiaming Liu, Xiaoqi Li, Qizhe Zhang, Mingjie Pan, Yulu Gan, Zehui Chen, Shanghang Zhang

The visual prompts have provided an efficient manner in addressing visual cross-domain problems.

Depth Estimation Semantic Segmentation +1

Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View

no code implementations • CVPR 2023 • Shuo Wang, Xinhai Zhao, Hai-Ming Xu, Zehui Chen, Dameng Yu, Jiahao Chang, Zhen Yang, Feng Zhao

Based on the covariate shift assumption, we find that the gap mainly attributes to the feature distribution of BEV, which is determined by the quality of both depth estimation and 2D image's feature representation.

3D Object Detection Depth Estimation +3

Learning from Noisy Data for Semi-Supervised 3D Object Detection

1 code implementation • ICCV 2023 • Zehui Chen, Zhenyu Li, Shuo Wang, Dengpan Fu, Feng Zhao

To this end, we propose NoiseDet, a simple yet effective framework for semi-supervised 3D object detection.

3D Object Detection object-detection +2

BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection

no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang

In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.

3D Object Detection Autonomous Driving +4

BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection

1 code implementation • 17 Nov 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

Instead of directly training a depth prediction network, we unify the image and LiDAR features in the Bird-Eye-View (BEV) space and adaptively transfer knowledge across non-homogenous representations in a teacher-student paradigm.

Ranked #14 on 3D Object Detection on nuScenes Camera Only

3D Object Detection Depth Estimation +4

100

DETRDistill: A Universal Knowledge Distillation Framework for DETR-families

no code implementations • ICCV 2023 • Jiahao Chang, Shuo Wang, HaiMing Xu, Zehui Chen, Chenhongyi Yang, Feng Zhao

Next, we propose a target-aware feature distillation to help the student model learn from the object-centric features of the teacher model.

Knowledge Distillation object-detection +1

Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report

no code implementations • 7 Nov 2022 • Andrey Ignatov, Grigory Malivenko, Radu Timofte, Lukasz Treszczotko, Xin Chang, Piotr Ksiazek, Michal Lopuszynski, Maciej Pioro, Rafal Rudnicki, Maciej Smyl, Yujie Ma, Zhenyu Li, Zehui Chen, Jialei Xu, Xianming Liu, Junjun Jiang, XueChao Shi, Difan Xu, Yanan Li, Xiaotao Wang, Lei Lei, Ziyu Zhang, Yicheng Wang, Zilong Huang, Guozhong Luo, Gang Yu, Bin Fu, Jiaqi Li, Yiran Wang, Zihao Huang, Zhiguo Cao, Marcos V. Conde, Denis Sapozhnikov, Byeong Hyun Lee, Dongwon Park, Seongmin Hong, Joonhee Lee, Seunggyu Lee, Se Young Chun

Various depth estimation models are now widely used on many mobile and IoT devices for image segmentation, bokeh effect rendering, object tracking and many other mobile tasks.

Bokeh Effect Rendering Depth Estimation +3

LiteDepth: Digging into Fast and Accurate Depth Estimation on Mobile Devices

1 code implementation • 2 Sep 2022 • Zhenyu Li, Zehui Chen, Jialei Xu, Xianming Liu, Junjun Jiang

Notably, our solution named LiteDepth ranks 2nd in the MAI&AIM2022 Monocular Depth Estimation Challenge, with a si-RMSE of 0.311, an RMSE of 3.79, and the inference time is 37 ms tested on the Raspberry Pi 4.

Data Augmentation Monocular Depth Estimation

AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection

1 code implementation • 21 Jul 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

Recently, AutoAlign presents a learnable paradigm in combining these two modalities for 3D object detection.

3D Object Detection Autonomous Driving +1

138

Towards Model Generalization for Monocular 3D Object Detection

no code implementations • 23 May 2022 • Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang

However, caused by severe domain gaps (e.g., the field of view (FOV), pixel size, and object size among datasets), Mono3D detectors have difficulty in generalization, leading to drastic performance degradation on unseen domains.

Autonomous Driving Monocular 3D Object Detection +3

Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection

no code implementations • 25 Apr 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao

3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding.

3D Object Detection Graph structure learning +3

Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training

1 code implementation • 25 Apr 2022 • Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang

Based on this, we develop a teacher-student paradigm to generate adaptive pseudo labels on the target domain.

Autonomous Driving Monocular 3D Object Detection +2

DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation

1 code implementation • 27 Mar 2022 • Zhenyu Li, Zehui Chen, Xianming Liu, Junjun Jiang

This paper aims to address the problem of supervised monocular depth estimation.

Ranked #20 on Monocular Depth Estimation on KITTI Eigen split (using extra training data)

Inductive Bias Monocular Depth Estimation

871

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

no code implementations • 17 Jan 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinghong Jiang, Feng Zhao, Bolei Zhou, Hang Zhao

This map enables our model to automate the alignment of non-homogenous features in a dynamic and data-driven manner.

3D Object Detection Autonomous Driving +1

SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations

1 code implementation • 9 Dec 2021 • Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang, Bolei Zhou, Hang Zhao

To bridge this gap, we aim to learn a spatial-aware visual representation that can describe the three-dimensional space and is more suitable and effective for these tasks.

Contrastive Learning Unsupervised Pre-training

Disentangle Your Dense Object Detector

2 code implementations • 7 Jul 2021 • Zehui Chen, Chenhongyi Yang, Qiaofei Li, Feng Zhao, Zheng-Jun Zha, Feng Wu

Extensive experiments on MS COCO benchmark show that our approach can lead to 2.0 mAP, 2.4 mAP and 2.2 mAP absolute improvements on RetinaNet, FCOS, and ATSS baselines with negligible extra overhead.

Disentanglement Object +2

27,857

Channel Models and Coding Solutions for 1S1R Crossbar Resistive Memory with High Line Resistance

no code implementations • 28 Apr 2021 • Zehui Chen, Lara Dolecek

Simulations based on these models suggest that the bit-error rate of devices is highly non-uniform across the memory array.

Towards Fine-grained Large Object Segmentation 1st Place Solution to 3D AI Challenge 2020 -- Instance Segmentation Track

1 code implementation • 10 Sep 2020 • Zehui Chen, Qiaofei Li, Feng Zhao

This technical report introduces our solutions of Team 'FineGrainedSeg' for Instance Segmentation track in 3D AI Challenge 2020.

Instance Segmentation Semantic Segmentation

1st Place Solutions of Waymo Open Dataset Challenge 2020 -- 2D Object Detection Track

1 code implementation • 4 Aug 2020 • Zehao Huang, Zehui Chen, Qiaofei Li, Hongkai Zhang, Naiyan Wang

In this technical report, we present our solutions of Waymo Open Dataset (WOD) Challenge 2020 - 2D Object Track.

Object object-detection +1
