Search Results for author: Wangbo Zhao

Found 15 papers, 12 papers with code

A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs

1 code implementation • 4 Dec 2024 • Wangbo Zhao, Yizeng Han, Jiasheng Tang, Zhikai Li, Yibing Song, Kai Wang, Zhangyang Wang, Yang You

Vision-language models (VLMs) have shown remarkable success across various multi-modal tasks, yet large VLMs encounter significant efficiency challenges due to processing numerous visual tokens.

Visual Question Answering

Dynamic Diffusion Transformer

1 code implementation • 4 Oct 2024 • Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Yibing Song, Gao Huang, Fan Wang, Yang You

In addition, we design a Spatial-wise Dynamic Token (SDT) strategy to avoid redundant computation at unnecessary spatial locations.

Image Generation

Prioritize Alignment in Dataset Distillation

1 code implementation • 6 Aug 2024 • Zekai Li, Ziyao Guo, Wangbo Zhao, Tianle Zhang, Zhi-Qi Cheng, Samir Khaki, Kaipeng Zhang, Ahmad Sajedi, Konstantinos N Plataniotis, Kai Wang, Yang You

To achieve this, existing methods use the agent model to extract information from the target dataset and embed it into the distilled dataset.

Dataset Distillation

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

1 code implementation • 6 May 2024 • Zheng Zhu, Xiaofeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang

General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems.

Autonomous Driving, Decision Making, +2

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

1 code implementation • 18 Mar 2024 • Wangbo Zhao, Jiasheng Tang, Yizeng Han, Yibing Song, Kai Wang, Gao Huang, Fan Wang, Yang You

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success on vision transformers (ViTs) adaptation by improving parameter efficiency.

parameter-efficient fine-tuning, Semantic Segmentation, +1

MMBench: Is Your Multi-modal Model an All-around Player?

3 code implementations • 12 Jul 2023 • Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin

In response to these challenges, we propose MMBench, a bilingual benchmark for assessing the multi-modal capabilities of VLMs.

Instruction Following, Multiple-choice, +1

Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance

1 code implementation • 2 Oct 2021 • Nian Liu, Wangbo Zhao, Dingwen Zhang, Junwei Han, Ling Shao

On the other hand, instead of processing the two kinds of data separately, we build a novel dual graph model to guide the focal stack fusion process using all-focus patterns.

Graph Learning, Saliency Detection

Instance-Level Relative Saliency Ranking with Graph Reasoning

no code implementations • 8 Jul 2021 • Nian Liu, Long Li, Wangbo Zhao, Junwei Han, Ling Shao

Conventional salient object detection models cannot differentiate the importance of different salient objects.

Image Retargeting, object-detection, +2

Weakly Supervised Video Salient Object Detection

1 code implementation • CVPR 2021 • Wangbo Zhao, Jing Zhang, Long Li, Nick Barnes, Nian Liu, Junwei Han

Significant performance improvement has been achieved for fully-supervised video salient object detection with the pixel-wise labeled training datasets, which are time-consuming and expensive to obtain.

Object, object-detection, +4
