The MultiModal Safety Benchmark (MM-SafetyBench) is a framework for safety-critical evaluations of Multimodal Large Language Models (MLLMs). It addresses a key security concern: MLLMs can be compromised by query-relevant images, as if the text query itself were malicious¹².
Here's a brief overview of MM-SafetyBench:
- Purpose: It evaluates how vulnerable MLLMs are to adversarial attacks that use images to manipulate model responses.
- Dataset: The benchmark covers 13 scenarios, for a total of 5,040 text-image pairs.
- Evaluation: It has been used to assess the safety of 12 state-of-the-art MLLMs, revealing their susceptibility to image-based manipulations¹ (see the sketch below for what such an evaluation loop might look like).
- Significance: The findings highlight the need for stronger safety measures in open-source MLLMs to guard against malicious exploits².
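To make the evaluation workflow concrete, here is a minimal sketch of running a model against the benchmark's text-image pairs. The file layout, field names, and the `query_mllm` / `judge_is_unsafe` callables are assumptions for illustration, not the repository's actual API; consult the MM-SafetyBench repo for the real data format and evaluation scripts.

```python
import json
from pathlib import Path

# Hypothetical layout: one "queries.json" file of {"question": ..., "image": ...}
# records per scenario directory; adjust paths and field names to the real data.
DATA_ROOT = Path("MM-SafetyBench/data")


def load_pairs(scenario_dir: Path):
    """Yield (text query, image path) pairs for one scenario."""
    with open(scenario_dir / "queries.json", encoding="utf-8") as f:
        for record in json.load(f):
            yield record["question"], scenario_dir / record["image"]


def evaluate(query_mllm, judge_is_unsafe):
    """Query the model under test with every text-image pair and report
    the fraction of responses the judge flags as unsafe."""
    unsafe, total = 0, 0
    for scenario_dir in sorted(p for p in DATA_ROOT.iterdir() if p.is_dir()):
        for question, image_path in load_pairs(scenario_dir):
            response = query_mllm(question, image_path)      # model under test
            unsafe += int(judge_is_unsafe(question, response))  # safety judge
            total += 1
    return unsafe / total  # attack success rate over all text-image pairs
```

In practice the judge is typically an LLM-based or rule-based classifier; any scoring function with the same signature would slot into the `judge_is_unsafe` hook above.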
(1) isXinLiu/MM-SafetyBench (GitHub). https://github.com/isXinLiu/MM-SafetyBench.
(2) MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models. https://arxiv.org/abs/2311.17600 (https://doi.org/10.48550/arXiv.2311.17600).
(3) MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models (Papers with Code). https://paperswithcode.com/paper/query-relevant-images-jailbreak-large-multi.
(4) SafetyBench: a comprehensive benchmark to evaluate the safety of LLMs (GitHub). https://github.com/thu-coai/SafetyBench.