Search Results for author: Yongshuo Zong

Found 7 papers, 7 papers with code

VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning

1 code implementation • 19 Mar 2024 • Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales

Built on top of LLMs, vision large language models (VLLMs) have advanced significantly in areas such as recognition, reasoning, and grounding.

Benchmarking • Image Captioning • +3

Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models

1 code implementation • 3 Feb 2024 • Yongshuo Zong, Ondrej Bohdal, Tingyang Yu, Yongxin Yang, Timothy Hospedales

Our experiments demonstrate that integrating this dataset into standard vision-language fine-tuning, or using it for post-hoc fine-tuning, effectively safety-aligns VLLMs (a data-mixing sketch follows this entry).

Instruction Following
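
The snippet above describes two strategies: blending a small safety set into the standard vision-language fine-tuning data, or fine-tuning on it post hoc. Below is a minimal sketch of the blending step, assuming hypothetical `vl_examples`/`safety_examples` lists of instruction-response records; the ratio and names are illustrative assumptions, not taken from the paper.

```python
import random

def mix_in_safety_data(vl_examples, safety_examples, safety_ratio=0.05, seed=0):
    """Blend a small safety subset into a vision-language fine-tuning set.

    Hypothetical sketch: `safety_ratio` and the record format are
    illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    n_safety = min(int(len(vl_examples) * safety_ratio), len(safety_examples))
    mixed = vl_examples + rng.sample(safety_examples, n_safety)
    rng.shuffle(mixed)
    return mixed

# Each record might look like {"image": ..., "instruction": ..., "response": ...}.
# train_set = mix_in_safety_data(vl_examples, safety_examples)
```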

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

1 code implementation • 10 Oct 2023 • Letian Zhang, Xiaotong Zhai, Zhongkai Zhao, Yongshuo Zong, Xin Wen, Bingchen Zhao

In light of the advancements in current multi-modal large language models, we explore their effectiveness in counterfactual reasoning.

Benchmarking • Code Generation • +4

Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations

1 code implementation • 2 Oct 2023 • Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, Timothy Hospedales

Large language and vision-language models are rapidly being deployed in practice thanks to their impressive capabilities in instruction following and in-context learning (a permutation consistency probe is sketched after this entry).

In-Context Learning • Instruction Following • +3
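
The "embarrassingly simple permutations" of the title refer to reordering the answer options of a multiple-choice question: a robust model should select the same option regardless of presentation order. Below is a minimal consistency probe, assuming a hypothetical `model_answer(question, options)` callable that returns the chosen option string; this is not the paper's evaluation code.

```python
from itertools import islice, permutations

def is_permutation_consistent(model_answer, question, options, max_perms=24):
    """Check whether the model's choice survives reordering the options.

    `model_answer` is an assumed callable returning the selected option
    string; `max_perms` caps the number of orderings tried.
    """
    answers = {model_answer(question, list(perm))
               for perm in islice(permutations(options), max_perms)}
    return len(answers) == 1
```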

Meta Omnium: A Benchmark for General-Purpose Learning-to-Learn

1 code implementation • CVPR 2023 • Ondrej Bohdal, Yinbing Tian, Yongshuo Zong, Ruchika Chavhan, Da Li, Henry Gouk, Li Guo, Timothy Hospedales

Meta-learning and other approaches to few-shot learning are widely studied for image recognition and are increasingly applied to other vision tasks such as pose estimation and dense prediction (an episode-sampling sketch follows this entry).

Few-Shot Learning • Pose Estimation • +1
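
Benchmarks like this one are typically consumed as episodes: each task is an N-way K-shot support set plus a query set. Below is a generic episode sampler, assuming `data` is a dict from class label to a list of examples; the function name and N/K defaults are illustrative, not Meta Omnium's actual API.

```python
import random

def sample_episode(data, n_way=5, k_shot=1, n_query=5, seed=None):
    """Draw one N-way K-shot episode from {class_label: [examples]}."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(data), n_way)
    support, query = [], []
    for label in classes:
        chosen = rng.sample(data[label], k_shot + n_query)
        support += [(x, label) for x in chosen[:k_shot]]
        query += [(x, label) for x in chosen[k_shot:]]
    return support, query
```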

Self-Supervised Multimodal Learning: A Survey

1 code implementation • 31 Mar 2023 • Yongshuo Zong, Oisin Mac Aodha, Timothy Hospedales

In this survey, we provide a comprehensive review of the state of the art in SSML, elucidating three major challenges intrinsic to self-supervised learning with multimodal data: (1) learning representations from multimodal data without labels, (2) fusion of different modalities, and (3) learning with unaligned data (a contrastive-alignment sketch for challenge (1) follows this entry).

Machine Translation • Self-Supervised Learning
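
Challenge (1), learning representations from unlabeled multimodal data, is commonly tackled with a CLIP-style contrastive objective that treats co-occurring image-text pairs as positives. Below is a NumPy sketch of the symmetric InfoNCE loss; the formulation is standard, but the function names here are illustrative.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE: row i of img_emb pairs with row i of txt_emb."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # (B, B) cosine similarities
    diag = np.arange(len(logits))
    loss_i2t = -log_softmax(logits)[diag, diag].mean()    # image -> text
    loss_t2i = -log_softmax(logits.T)[diag, diag].mean()  # text -> image
    return (loss_i2t + loss_t2i) / 2
```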

MEDFAIR: Benchmarking Fairness for Medical Imaging

1 code implementation • 4 Oct 2022 • Yongshuo Zong, Yongxin Yang, Timothy Hospedales

In this work, we introduce MEDFAIR, a framework to benchmark the fairness of machine learning models for medical imaging (a subgroup-gap sketch follows this entry).

Benchmarking • Fairness • +2
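
Fairness benchmarking in this setting usually means comparing a utility metric across subgroups defined by a sensitive attribute and reporting the gap. Below is a generic sketch, assuming arrays of binary labels, model scores, and group ids; it illustrates the idea rather than MEDFAIR's actual interface.

```python
import numpy as np

def subgroup_accuracy_gap(y_true, y_score, groups, threshold=0.5):
    """Worst-case accuracy gap between subgroups at a fixed threshold.

    Illustrative only: real fairness audits typically report several
    metrics (e.g. AUC gaps, worst-case subgroup performance).
    """
    y_true, groups = np.asarray(y_true), np.asarray(groups)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    accs = {g: float((y_pred[groups == g] == y_true[groups == g]).mean())
            for g in np.unique(groups)}
    return max(accs.values()) - min(accs.values()), accs
```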
