Search Results for author: Yunhao Gou

Found 5 papers, 1 paper with code

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

no code implementations • 14 Mar 2024 • Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang

Multimodal large language models (MLLMs) have shown impressive reasoning abilities, but they are also more vulnerable to jailbreak attacks than their LLM predecessors.

Optical Character Recognition (OCR)


Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

no code implementations • 19 Dec 2023 • Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang

Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks.

Instruction Following • Zero-shot Generalization


Leveraging per Image-Token Consistency for Vision-Language Pre-training

no code implementations • CVPR 2023 • Yunhao Gou, Tom Ko, Hansi Yang, James Kwok, Yu Zhang, Mingxuan Wang

(2) Under-utilization of the unmasked tokens: CMLM primarily focuses on the masked token but it cannot simultaneously leverage other tokens to learn vision-language associations.

Language Modelling • Masked Language Modeling • +1


Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification

1 code implementation • 2 Mar 2022 • Kai Yi, Xiaoqian Shen, Yunhao Gou, Mohamed Elhoseiny

The main question we address in this paper is how to scale up visual recognition of unseen classes, also known as zero-shot learning, to tens of thousands of categories as in the ImageNet-21K benchmark.

Image Classification • Zero-Shot Image Classification • +1


Region Semantically Aligned Network for Zero-Shot Learning

no code implementations • 14 Oct 2021 • Ziyang Wang, Yunhao Gou, Jingjing Li, Yu Zhang, Yang Yang

Zero-shot learning (ZSL) aims to recognize unseen classes based on the knowledge of seen classes.

Attribute • Transfer Learning • +1

