Search Results for author: Ziran Yang

Found 5 papers, 3 papers with code

From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding

1 code implementation · 9 Dec 2024 · Yixiong Fang, Ziran Yang, Zhaorun Chen, Zhuokai Zhao, Jiawei Zhou

Large vision-language models (LVLMs) demonstrate remarkable capabilities in multimodal tasks but are prone to misinterpreting visual inputs, often resulting in hallucinations and unreliable outputs.

ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain

1 code implementation · 23 Nov 2024 · Haochen Zhao, Xiangru Tang, Ziran Yang, Xiao Han, Xuanzhi Feng, Yueqing Fan, Senhao Cheng, Di Jin, Yilun Zhao, Arman Cohan, Mark Gerstein

To address this issue in the field of chemistry, we introduce ChemSafetyBench, a benchmark designed to evaluate the accuracy and safety of LLM responses.

Benchmarking, Diversity

SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset

1 code implementation · 20 Jun 2024 · Josef Dai, Tianle Chen, Xuyao Wang, Ziran Yang, Taiye Chen, Jiaming Ji, Yaodong Yang

To mitigate the risk of harmful outputs from large vision models (LVMs), we introduce the SafeSora dataset to promote research on aligning text-to-video generation with human values.

Safety Alignment, Text-to-Video Generation, +2 more

Panacea: Pareto Alignment via Preference Adaptation for LLMs

no code implementations · 3 Feb 2024 · Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Haojun Chen, Qingfu Zhang, Siyuan Qi, Yaodong Yang

Panacea trains a single model capable of adapting online and Pareto-optimally to diverse sets of preferences without the need for further tuning.

Language Modelling, Large Language Model

Evolving Diverse Red-team Language Models in Multi-round Multi-agent Games

no code implementations · 30 Sep 2023 · Chengdong Ma, Ziran Yang, Hai Ci, Jun Gao, Minquan Gao, Xuehai Pan, Yaodong Yang

Furthermore, we develop a Gamified Red Team Solver (GRTS) with diversity measures to mitigate mode collapse and theoretically guarantee convergence to an approximate Nash equilibrium, which results in better strategies for both teams.

Diversity, Language Modelling, +2 more
