Search Results for author: Yueze Wang

Found 5 papers, 5 papers with code

Efficient Multimodal Learning from Data-centric Perspective

1 code implementation • 18 Feb 2024 • Muyang He, Yexin Liu, Boya Wu, Jianhao Yuan, Yueze Wang, Tiejun Huang, Bo Zhao

Multimodal Large Language Models (MLLMs) have demonstrated notable capabilities in general visual understanding and reasoning tasks.

674

Universal Prompt Optimizer for Safe Text-to-Image Generation

1 code implementation • 16 Feb 2024 • Zongyu Wu, Hongcheng Gao, Yueze Wang, Xiang Zhang, Suhang Wang

To enable the optimizer to convert toxic prompts into clean prompts while preserving semantic information, we design a novel reward function that measures the toxicity and text alignment of generated images, and train the optimizer through Proximal Policy Optimization.

Blocking Text-to-Image Generation

Generative Multimodal Models are In-Context Learners

1 code implementation • 20 Dec 2023 • Quan Sun, Yufeng Cui, Xiaosong Zhang, Fan Zhang, Qiying Yu, Zhengxiong Luo, Yueze Wang, Yongming Rao, Jingjing Liu, Tiejun Huang, Xinlong Wang

The human ability to easily solve multimodal tasks in context (i.e., with only a few demonstrations or simple instructions) is what current multimodal systems have largely struggled to imitate.

Ranked #22 on Visual Question Answering on MM-Vet

In-Context Learning, Question Answering, +2

1,511

Emu: Generative Pretraining in Multimodality

2 code implementations • 11 Jul 2023 • Quan Sun, Qiying Yu, Yufeng Cui, Fan Zhang, Xiaosong Zhang, Yueze Wang, Hongcheng Gao, Jingjing Liu, Tiejun Huang, Xinlong Wang

We present Emu, a Transformer-based multimodal foundation model, which can seamlessly generate images and texts in multimodal context.

Ranked #1 on Visual Question Answering on VQA v2

Image Captioning, Temporal/Causal QA, +4

1,511

Fine-Grained Visual Prompting

1 code implementation • NeurIPS 2023 • Lingfeng Yang, Yueze Wang, Xiang Li, Xinlong Wang, Jian Yang

Previous works have suggested that incorporating visual prompts, such as colorful boxes or circles, can improve the ability of models to recognize objects of interest.

Visual Prompting

