Search Results for author: Yilei Jiang

Found 6 papers, 2 papers with code

HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States

1 code implementation • 20 Feb 2025 • Yilei Jiang, Xinyan Gao, Tianshuo Peng, Yingshui Tan, Xiaoyong Zhu, Bo Zheng, Xiangyu Yue

The integration of additional modalities increases the susceptibility of large vision-language models (LVLMs) to safety risks, such as jailbreak attacks, compared to their language-only counterparts.

Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models

no code implementations • 17 Feb 2025 • Yingshui Tan, Yilei Jiang, Yanshi Li, Jiaheng Liu, Xingyuan Bu, Wenbo Su, Xiangyu Yue, Xiaoyong Zhu, Bo Zheng

Fine-tuning large language models (LLMs) based on human preferences, commonly achieved through reinforcement learning from human feedback (RLHF), has been effective in improving their performance.

Safety Alignment

DebiasDiff: Debiasing Text-to-image Diffusion Models with Self-discovering Latent Attribute Directions

no code implementations • 25 Dec 2024 • Yilei Jiang, Weihong Li, Yiyuan Zhang, Minghong Cai, Xiangyu Yue

The distribution indicator is multiplied by the set of adapters to guide the generation process toward the prescribed distribution.

Attribute

RapGuard: Safeguarding Multimodal Large Language Models via Rationale-aware Defensive Prompting

no code implementations • 25 Dec 2024 • Yilei Jiang, Yingshui Tan, Xiangyu Yue

While Multimodal Large Language Models (MLLMs) have made remarkable progress in vision-language reasoning, they are also more susceptible to producing harmful content compared to models that focus solely on text.

Event-Customized Image Generation

no code implementations • 3 Oct 2024 • Zhen Wang, Yilei Jiang, Dong Zheng, Jun Xiao, Long Chen

To extend customized image generation to more complex scenes for general real-world applications, we propose a new task: event-customized image generation.

Denoising • Image Generation
