1 code implementation • 19 Mar 2024 • Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales
Built on top of LLMs, vision large language models (VLLMs) have advanced significantly in areas such as recognition, reasoning, and grounding.
1 code implementation • 3 Feb 2024 • Yongshuo Zong, Ondrej Bohdal, Tingyang Yu, Yongxin Yang, Timothy Hospedales
Our experiments demonstrate that integrating this dataset into standard vision-language fine-tuning, or utilizing it for post-hoc fine-tuning, effectively safety-aligns VLLMs.
1 code implementation • 10 Oct 2023 • Letian Zhang, Xiaotong Zhai, Zhongkai Zhao, Yongshuo Zong, Xin Wen, Bingchen Zhao
In light of the advancements in current multi-modal large language models, we explore their effectiveness in counterfactual reasoning.
1 code implementation • 2 Oct 2023 • Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, Timothy Hospedales
Large language and vision-language models are rapidly being deployed in practice thanks to their impressive capabilities, such as instruction following and in-context learning.
1 code implementation • CVPR 2023 • Ondrej Bohdal, Yinbing Tian, Yongshuo Zong, Ruchika Chavhan, Da Li, Henry Gouk, Li Guo, Timothy Hospedales
Meta-learning and other approaches to few-shot learning are widely studied for image recognition, and are increasingly applied to other vision tasks such as pose estimation and dense prediction.
1 code implementation • 31 Mar 2023 • Yongshuo Zong, Oisin Mac Aodha, Timothy Hospedales
In this survey, we provide a comprehensive review of the state-of-the-art in SSML, in which we elucidate three major challenges intrinsic to self-supervised learning with multimodal data: (1) learning representations from multimodal data without labels, (2) fusion of different modalities, and (3) learning with unaligned data.
1 code implementation • 4 Oct 2022 • Yongshuo Zong, Yongxin Yang, Timothy Hospedales
In this work, we introduce MEDFAIR, a framework to benchmark the fairness of machine learning models for medical imaging.