Search Results for author: Xumeng Han

Found 11 papers, 5 papers with code

AD^2-Bench: A Hierarchical CoT Benchmark for MLLM in Autonomous Driving under Adverse Conditions

no code implementations • 11 Jun 2025 • Zhaoyang Wei, Chenhui Qiang, Bowen Jiang, Xumeng Han, Xuehui Yu, Zhenjun Han

Chain-of-Thought (CoT) reasoning has emerged as a powerful approach to enhance the structured, multi-step decision-making capabilities of Multi-Modal Large Models (MLLMs), which is particularly crucial for autonomous driving under adverse weather conditions and in complex traffic environments.

Autonomous Driving

Mixpert: Mitigating Multimodal Learning Conflicts with Efficient Mixture-of-Vision-Experts

no code implementations • 30 May 2025 • Xin He, Xumeng Han, Longhui Wei, Lingxi Xie, Qi Tian

In this paper, we introduce Mixpert, an efficient mixture-of-vision-experts architecture that inherits the joint learning advantages from a single vision encoder while being restructured into a multi-expert paradigm for task-specific fine-tuning across different visual tasks.

Multi-Task Learning

P2Object: Single Point Supervised Object Detection and Instance Segmentation

1 code implementation • 10 Apr 2025 • Pengfei Chen, Xuehui Yu, Xumeng Han, Kuiran Wang, Guorong Li, Lingxi Xie, Zhenjun Han, Jianbin Jiao

In this paper, we introduce Point-to-Box Network (P2BNet), which constructs balanced instance-level proposal bags by generating proposals in an anchor-like way and refining the proposals in a coarse-to-fine paradigm.

Instance Segmentation, Multiple Instance Learning +4

GaGA: Towards Interactive Global Geolocation Assistant

no code implementations • 12 Dec 2024 • Zhiyang Dou, Zipeng Wang, Xumeng Han, Guorong Li, Zhipei Huang, Zhenjun Han

Global geolocation, which seeks to predict the geographical location of images captured anywhere in the world, is one of the most challenging tasks in the field of computer vision.

World Knowledge

ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts

no code implementations • 21 Oct 2024 • Xumeng Han, Longhui Wei, Zhiyang Dou, Zipeng Wang, Chenhui Qiang, Xin He, Yingfei Sun, Zhenjun Han, Qi Tian

Mixture-of-Experts (MoE) models embody the divide-and-conquer concept and are a promising approach for increasing model capacity, demonstrating excellent scalability across multiple domains.

Image Classification +1

CPR++: Object Localization via Single Coarse Point Supervision

2 code implementations • 30 Jan 2024 • Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao

CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.

Object, Object Localization

P2Seg: Pointly-supervised Segmentation via Mutual Distillation

no code implementations • 18 Jan 2024 • Zipeng Wang, Xuehui Yu, Xumeng Han, Wenwen Yu, Zhixun Huang, Jianbin Jiao, Zhenjun Han

Nevertheless, weakly supervised semantic segmentation methods are proficient in utilizing intra-class feature consistency to capture the boundary contours of the same semantic regions.

Box-supervised Instance Segmentation, Segmentation +2

Boosting Segment Anything Model Towards Open-Vocabulary Learning

1 code implementation • 6 Dec 2023 • Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Zhenjun Han, Qi Tian

The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.

Model, Object +3

P2RBox: Point Prompt Oriented Object Detection with SAM

no code implementations • 22 Nov 2023 • Guangming Cao, Xuehui Yu, Wenwen Yu, Xumeng Han, Xue Yang, Guorong Li, Jianbin Jiao, Zhenjun Han

In this study, we introduce P2RBox, which employs point prompts to generate rotated box (RBox) annotations for oriented object detection.

Object, Object Detection +2

Rethinking Sampling Strategies for Unsupervised Person Re-identification

2 code implementations • 7 Jul 2021 • Xumeng Han, Xuehui Yu, Guorong Li, Jian Zhao, Gang Pan, Qixiang Ye, Jianbin Jiao, Zhenjun Han

While extensive research has focused on framework design and loss functions, this paper shows that the sampling strategy plays an equally important role.

Pseudo Label, Representation Learning +1
