Search Results for author: Yingshui Tan

Found 12 papers, 0 papers with code

RapGuard: Safeguarding Multimodal Large Language Models via Rationale-aware Defensive Prompting

no code implementations • 25 Dec 2024 • Yilei Jiang, Yingshui Tan, Xiangyu Yue

While Multimodal Large Language Models (MLLMs) have made remarkable progress in vision-language reasoning, they are also more susceptible to producing harmful content compared to models that focus solely on text.

PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment

no code implementations • 18 Nov 2024 • Zhendong Liu, Yuanbi Nie, Yingshui Tan, Jiaheng Liu, Xiangyu Yue, Qiushi Cui, Chongjun Wang, Xiaoyong Zhu, Bo Zheng

However, recent research shows that the visual modality in VLMs is highly vulnerable, allowing attackers to bypass safety alignment in LLMs through visually transmitted content, launching harmful attacks.

Language Modeling Language Modelling +1

Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment

no code implementations • 23 Oct 2024 • Yanshi Li, Shaopan Xiong, Gengru Chen, Xiaoyang Li, Yijia Luo, Xingyao Zhang, Yanhui Huang, Xingyuan Bu, Yingshui Tan, Chun Yuan, Jiamang Wang, Wenbo Su, Bo Zheng

Our method improves the success rate on adversarial samples by 10% compared to the sample-wise approach, and achieves a 1.3% improvement on evaluation benchmarks such as MMLU, GSM8K, and HumanEval.

GSM8K HumanEval +1

Safety Alignment for Vision Language Models

no code implementations • 22 May 2024 • Zhendong Liu, Yuanbi Nie, Yingshui Tan, Xiangyu Yue, Qiushi Cui, Chongjun Wang, Xiaoyong Zhu, Bo Zheng

To address this issue, we strengthen the safety alignment of the visual modality in existing VLMs by adding safety modules (a safety projector, safety tokens, and a safety head) through a two-stage training process, effectively improving the model's defense against risky images.

Red Teaming Safety Alignment

Generalizing Fault Detection Against Domain Shifts Using Stratification-Aware Cross-Validation

no code implementations • 20 Aug 2020 • Yingshui Tan, Baihong Jin, Qiushi Cui, Xiangyu Yue, Alberto Sangiovanni Vincentelli

Incipient anomalies present milder symptoms compared to severe ones, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions.

Anomaly Detection Ensemble Learning +1

Using Ensemble Classifiers to Detect Incipient Anomalies

no code implementations • 20 Aug 2020 • Baihong Jin, Yingshui Tan, Albert Liu, Xiangyu Yue, Yuxin Chen, Alberto Sangiovanni Vincentelli

Incipient anomalies present milder symptoms compared to severe ones, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions.

Anomaly Detection Ensemble Learning

Are Ensemble Classifiers Powerful Enough for the Detection and Diagnosis of Intermediate-Severity Faults?

no code implementations • 7 Jul 2020 • Baihong Jin, Yingshui Tan, Yuxin Chen, Kameshwar Poolla, Alberto Sangiovanni Vincentelli

Intermediate-Severity (IS) faults present milder symptoms compared to severe faults, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions.

Fault Detection
