Search Results for author: Wang Zeng

Found 9 papers, 7 papers with code

NADER: Neural Architecture Design via Multi-Agent Collaboration

no code implementations • 26 Dec 2024 • Zekang Yang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

In this paper, we introduce NADER (Neural Architecture Design via multi-agEnt collaboRation), a novel framework that formulates neural architecture design (NAD) as an LLM-based multi-agent collaboration problem.

Neural Architecture Search

KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension

no code implementations • 4 Nov 2024 • Jie Yang, Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Ruimao Zhang

To bridge this gap, we introduce the novel challenge of Semantic Keypoint Comprehension, which aims to comprehend keypoints across different task scenarios, including keypoint semantic understanding, visual prompt-based keypoint detection, and textual prompt-based keypoint detection.

Keypoint Detection, Language Modeling, +2

TCFormer: Visual Recognition via Token Clustering Transformer

1 code implementation • 16 Jul 2024 • Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

Our dynamic tokens possess two crucial characteristics: (1) representing image regions with similar semantic meanings using the same vision token, even if those regions are not adjacent, and (2) concentrating on regions with valuable details and representing them using fine tokens.

Clustering, Image Classification, +4

When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset

1 code implementation • 14 Jul 2024 • Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

With multi-modal joint training, our model achieves state-of-the-art performance on a wide range of pedestrian detection benchmarks, surpassing leading models tailored for specific sensor modalities.

3D Object Detection, Multispectral Object Detection, +1

AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

1 code implementation • 23 Feb 2024 • Zekang Yang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

While traditional AutoML approaches have been successfully applied in several critical steps of model development (e.g., hyperparameter optimization), there is no AutoML system that automates the entire end-to-end model production workflow for computer vision.

Hyperparameter Optimization, Keypoint Estimation

GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

1 code implementation • 28 Aug 2023 • Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian, Ping Luo, Ji Wu

Multi-Label Image Recognition (MLIR) is a challenging task that aims to predict multiple object labels in a single image while modeling the complex relationships between labels and image regions.

Graph Construction, Multi-Label Classification, +1

Pose for Everything: Towards Category-Agnostic Pose Estimation

1 code implementation • 21 Jul 2022 • Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

In this paper, we introduce the task of Category-Agnostic Pose Estimation (CAPE), which aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definitions.

Category-Agnostic Pose Estimation, Pose Estimation

3D Human Mesh Regression with Dense Correspondence

3 code implementations • CVPR 2020 • Wang Zeng, Wanli Ouyang, Ping Luo, Wentao Liu, Xiaogang Wang

This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space (i.e., a 2D space used for texture mapping of 3D meshes).

3D Human Pose Estimation, 3D Human Reconstruction, +1
