1 code implementation • 19 Mar 2024 • Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales
Built on top of LLMs, vision large language models (VLLMs) have advanced significantly in areas such as recognition, reasoning, and grounding.
1 code implementation • 3 Feb 2024 • Yongshuo Zong, Ondrej Bohdal, Tingyang Yu, Yongxin Yang, Timothy Hospedales
Our experiments demonstrate that integrating this dataset into standard vision-language fine-tuning, or utilizing it for post-hoc fine-tuning, effectively safety-aligns VLLMs.
1 code implementation • 10 Oct 2023 • Letian Zhang, Xiaotong Zhai, Zhongkai Zhao, Yongshuo Zong, Xin Wen, Bingchen Zhao
In light of the advancements in current multi-modal large language models, we explore their effectiveness in counterfactual reasoning.
1 code implementation • 2 Oct 2023 • Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, Timothy Hospedales
Large language and vision-language models are rapidly being deployed in practice thanks to their impressive capabilities, such as instruction following and in-context learning.
1 code implementation • CVPR 2023 • Ondrej Bohdal, Yinbing Tian, Yongshuo Zong, Ruchika Chavhan, Da Li, Henry Gouk, Li Guo, Timothy Hospedales
Meta-learning and other approaches to few-shot learning are widely studied for image recognition, and are increasingly applied to other vision tasks such as pose estimation and dense prediction.
1 code implementation • 31 Mar 2023 • Yongshuo Zong, Oisin Mac Aodha, Timothy Hospedales
In this survey, we provide a comprehensive review of the state-of-the-art in SSML, in which we elucidate three major challenges intrinsic to self-supervised learning with multimodal data: (1) learning representations from multimodal data without labels, (2) fusion of different modalities, and (3) learning with unaligned data.
1 code implementation • 4 Oct 2022 • Yongshuo Zong, Yongxin Yang, Timothy Hospedales
In this work, we introduce MEDFAIR, a framework to benchmark the fairness of machine learning models for medical imaging.