Search Results for author: Yanting Zhang

Found 9 papers, 5 papers with code

Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model

no code implementations • 6 Apr 2024 • Zhonghan Zhao, Ke Ma, Wenhao Chai, Xuan Wang, Kewei Chen, Dongxu Guo, Yanting Zhang, Hongwei Wang, Gaoang Wang

After distillation, embodied agents can complete complex, open-ended tasks without additional expert guidance, utilizing the performance and knowledge of a versatile MLM.

Knowledge Distillation

Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation

no code implementations • 13 Mar 2024 • Zhonghan Zhao, Kewei Chen, Dongxu Guo, Wenhao Chai, Tian Ye, Yanting Zhang, Gaoang Wang

To assess organizational behavior, we design a series of navigation tasks in the Minecraft environment, which includes searching and exploring.

Navigate

Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation

no code implementations • 8 Mar 2024 • Yu Han, Ziwei Long, Yanting Zhang, Jin Wu, Zhijun Fang, Rui Fan

Taking into account the practical applicability of our method in real-world robotics applications, we also propose a novel patch descriptor distillation strategy to further reduce the computational complexity of correspondence matching.

Image Classification, Semantic Segmentation +1

Towards Effective Multi-Moving-Camera Tracking: A New Dataset and Lightweight Link Model

1 code implementation • 18 Dec 2023 • Yanting Zhang, Shuanghong Wang, Qingxiang Wang, Cairong Yan, Rui Fan

Moreover, to alleviate the impact of image style variations caused by different cameras, a color transfer module is incorporated to extract cross-camera consistent appearance features for pedestrian association across moving cameras for ICT. The result is a much improved MTMMC tracking system, a further step towards coordinated mining of multiple moving cameras.

Autonomous Vehicles

UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning

no code implementations • 19 Aug 2023 • Meiqi Sun, Zhonghan Zhao, Wenhao Chai, Hanjun Luo, Shidong Cao, Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang

Our proposed model takes support images and labels as prompt guidance for a query image.

Few-Shot Learning, Pose Estimation

MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

1 code implementation • 31 Jul 2023 • Enxin Song, Wenhao Chai, Guanhong Wang, Yucheng Zhang, Haoyang Zhou, Feiyang Wu, Haozhe Chi, Xun Guo, Tian Ye, Yanting Zhang, Yan Lu, Jenq-Neng Hwang, Gaoang Wang

Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks.

Ranked #1 on zero-shot long video global-mode question answering on MovieChat-1K

Video-based Generative Performance Benchmarking (Consistency), Video-based Generative Performance Benchmarking (Contextual Understanding) +10

DiffFashion: Reference-based Fashion Design with Structure-aware Transfer by Diffusion Models

1 code implementation • 14 Feb 2023 • Shidong Cao, Wenhao Chai, Shengyu Hao, Yanting Zhang, Hangyue Chen, Gaoang Wang

We focus on a new fashion design task, where we aim to transfer a reference appearance image onto a clothing image while preserving the structure of the clothing image.

Denoising, Style Transfer